Linux NFS development
 help / color / mirror / Atom feed
* [PATCH 0/9 v3] support admin-revocation of v4 state
@ 2023-11-17  2:18 NeilBrown
  2023-11-17  2:18 ` [PATCH 1/9] nfsd: hold ->cl_lock for hash_delegation_locked() NeilBrown
                   ` (9 more replies)
  0 siblings, 10 replies; 31+ messages in thread
From: NeilBrown @ 2023-11-17  2:18 UTC (permalink / raw)
  To: Chuck Lever, Jeff Layton
  Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

This set adds a prequel series to my previous posting which addresses
some locking problems around ->sc_type and splits that field into
to separate fields: sc_type and sc_status.
The recovation code is modified to accomodate these changed.

Thanks
NeilBrown

 [PATCH 1/9] nfsd: hold ->cl_lock for hash_delegation_locked()
 [PATCH 2/9] nfsd: avoid race after unhash_delegation_locked()
 [PATCH 3/9] nfsd: split sc_status out of sc_type
 [PATCH 4/9] nfsd: prepare for supporting admin-revocation of state
 [PATCH 5/9] nfsd: allow admin-revoked state to appear in
 [PATCH 6/9] nfsd: allow admin-revoked NFSv4.0 state to be freed.
 [PATCH 7/9] nfsd: allow lock state ids to be revoked and then freed
 [PATCH 8/9] nfsd: allow open state ids to be revoked and then freed
 [PATCH 9/9] nfsd: allow delegation state ids to be revoked and then


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 1/9] nfsd: hold ->cl_lock for hash_delegation_locked()
  2023-11-17  2:18 [PATCH 0/9 v3] support admin-revocation of v4 state NeilBrown
@ 2023-11-17  2:18 ` NeilBrown
  2023-11-17 10:59   ` Jeff Layton
  2023-11-17 15:04   ` Chuck Lever
  2023-11-17  2:18 ` [PATCH 2/9] nfsd: avoid race after unhash_delegation_locked() NeilBrown
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 31+ messages in thread
From: NeilBrown @ 2023-11-17  2:18 UTC (permalink / raw)
  To: Chuck Lever, Jeff Layton
  Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

The protocol for creating a new state in nfsd is to allocated the state
leaving it largely uninitialised, add that state to the ->cl_stateids
idr so as to reserve a state id, then complete initialisation of the
state and only set ->sc_type to non-zero once the state is fully
initialised.

If a state is found in the idr with ->sc_type == 0, it is ignored.
The ->cl_lock list is used to avoid races - it is held while checking
sc_type during lookup, and held when a non-zero value is stored in
->sc_type.

... except... hash_delegation_locked() finalises the initialisation of a
delegation state, but does NOT hold ->cl_lock.

So this patch takes ->cl_lock at the appropriate time w.r.t other locks,
and so ensures there are no races (which are extremely unlikely in any
case).

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/nfsd/nfs4state.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 65fd5510323a..6368788a7d4e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1317,6 +1317,7 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
 
 	lockdep_assert_held(&state_lock);
 	lockdep_assert_held(&fp->fi_lock);
+	lockdep_assert_held(&clp->cl_lock);
 
 	if (nfs4_delegation_exists(clp, fp))
 		return -EAGAIN;
@@ -5609,12 +5610,14 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
 		goto out_unlock;
 
 	spin_lock(&state_lock);
+	spin_lock(&clp->cl_lock);
 	spin_lock(&fp->fi_lock);
 	if (fp->fi_had_conflict)
 		status = -EAGAIN;
 	else
 		status = hash_delegation_locked(dp, fp);
 	spin_unlock(&fp->fi_lock);
+	spin_unlock(&clp->cl_lock);
 	spin_unlock(&state_lock);
 
 	if (status)
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 2/9] nfsd: avoid race after unhash_delegation_locked()
  2023-11-17  2:18 [PATCH 0/9 v3] support admin-revocation of v4 state NeilBrown
  2023-11-17  2:18 ` [PATCH 1/9] nfsd: hold ->cl_lock for hash_delegation_locked() NeilBrown
@ 2023-11-17  2:18 ` NeilBrown
  2023-11-17 11:41   ` Jeff Layton
  2023-11-17  2:18 ` [PATCH 3/9] nfsd: split sc_status out of sc_type NeilBrown
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: NeilBrown @ 2023-11-17  2:18 UTC (permalink / raw)
  To: Chuck Lever, Jeff Layton
  Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

NFS4_CLOSED_DELEG_STID and NFS4_REVOKED_DELEG_STID are similar in
purpose.
REVOKED is used for NFSv4.1 states which have been revoked because the
lease has expired.  CLOSED is used in other cases.
The difference has two practical effects.
1/ REVOKED states are on the ->cl_revoked list
2/ REVOKED states result in nfserr_deleg_revoked from
   nfsd4_verify_open_stid() asnd nfsd4_validate_stateid while
   CLOSED states result in nfserr_bad_stid.

Currently a state that is being revoked is first set to "CLOSED" in
unhash_delegation_locked(), then possibly to "REVOKED" in
revoke_delegation(), at which point it is added to the cl_revoked list.

It is possible that a stateid test could see the CLOSED state
which really should be REVOKED, and so return the wrong error code.  So
it is safest to remove this window of inconsistency.

With this patch, unhash_delegation_locked() always set the state
correctly, and revoke_delegation() no longer changes the state.

Also remove a redundant test on minorversion when
NFS4_REVOKED_DELEG_STID is seen - it can only be seen when minorversion
is non-zero.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/nfsd/nfs4state.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 6368788a7d4e..7469583382fb 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1334,7 +1334,7 @@ static bool delegation_hashed(struct nfs4_delegation *dp)
 }
 
 static bool
-unhash_delegation_locked(struct nfs4_delegation *dp)
+unhash_delegation_locked(struct nfs4_delegation *dp, unsigned char type)
 {
 	struct nfs4_file *fp = dp->dl_stid.sc_file;
 
@@ -1343,7 +1343,9 @@ unhash_delegation_locked(struct nfs4_delegation *dp)
 	if (!delegation_hashed(dp))
 		return false;
 
-	dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
+	if (dp->dl_stid.sc_client->cl_minorversion == 0)
+		type = NFS4_CLOSED_DELEG_STID;
+	dp->dl_stid.sc_type = type;
 	/* Ensure that deleg break won't try to requeue it */
 	++dp->dl_time;
 	spin_lock(&fp->fi_lock);
@@ -1359,7 +1361,7 @@ static void destroy_delegation(struct nfs4_delegation *dp)
 	bool unhashed;
 
 	spin_lock(&state_lock);
-	unhashed = unhash_delegation_locked(dp);
+	unhashed = unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID);
 	spin_unlock(&state_lock);
 	if (unhashed)
 		destroy_unhashed_deleg(dp);
@@ -1373,9 +1375,8 @@ static void revoke_delegation(struct nfs4_delegation *dp)
 
 	trace_nfsd_stid_revoke(&dp->dl_stid);
 
-	if (clp->cl_minorversion) {
+	if (dp->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID) {
 		spin_lock(&clp->cl_lock);
-		dp->dl_stid.sc_type = NFS4_REVOKED_DELEG_STID;
 		refcount_inc(&dp->dl_stid.sc_count);
 		list_add(&dp->dl_recall_lru, &clp->cl_revoked);
 		spin_unlock(&clp->cl_lock);
@@ -2234,7 +2235,7 @@ __destroy_client(struct nfs4_client *clp)
 	spin_lock(&state_lock);
 	while (!list_empty(&clp->cl_delegations)) {
 		dp = list_entry(clp->cl_delegations.next, struct nfs4_delegation, dl_perclnt);
-		WARN_ON(!unhash_delegation_locked(dp));
+		WARN_ON(!unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID));
 		list_add(&dp->dl_recall_lru, &reaplist);
 	}
 	spin_unlock(&state_lock);
@@ -5197,8 +5198,7 @@ nfs4_check_deleg(struct nfs4_client *cl, struct nfsd4_open *open,
 		goto out;
 	if (deleg->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID) {
 		nfs4_put_stid(&deleg->dl_stid);
-		if (cl->cl_minorversion)
-			status = nfserr_deleg_revoked;
+		status = nfserr_deleg_revoked;
 		goto out;
 	}
 	flags = share_access_to_flags(open->op_share_access);
@@ -6235,7 +6235,7 @@ nfs4_laundromat(struct nfsd_net *nn)
 		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
 		if (!state_expired(&lt, dp->dl_time))
 			break;
-		WARN_ON(!unhash_delegation_locked(dp));
+		WARN_ON(!unhash_delegation_locked(dp, NFS4_REVOKED_DELEG_STID));
 		list_add(&dp->dl_recall_lru, &reaplist);
 	}
 	spin_unlock(&state_lock);
@@ -8350,7 +8350,7 @@ nfs4_state_shutdown_net(struct net *net)
 	spin_lock(&state_lock);
 	list_for_each_safe(pos, next, &nn->del_recall_lru) {
 		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
-		WARN_ON(!unhash_delegation_locked(dp));
+		WARN_ON(!unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID));
 		list_add(&dp->dl_recall_lru, &reaplist);
 	}
 	spin_unlock(&state_lock);
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 3/9] nfsd: split sc_status out of sc_type
  2023-11-17  2:18 [PATCH 0/9 v3] support admin-revocation of v4 state NeilBrown
  2023-11-17  2:18 ` [PATCH 1/9] nfsd: hold ->cl_lock for hash_delegation_locked() NeilBrown
  2023-11-17  2:18 ` [PATCH 2/9] nfsd: avoid race after unhash_delegation_locked() NeilBrown
@ 2023-11-17  2:18 ` NeilBrown
  2023-11-17 19:52   ` Chuck Lever
  2023-11-17  2:18 ` [PATCH 4/9] nfsd: prepare for supporting admin-revocation of state NeilBrown
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: NeilBrown @ 2023-11-17  2:18 UTC (permalink / raw)
  To: Chuck Lever, Jeff Layton
  Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

sc_type identifies the type of a state - open, lock, deleg, layout - and
also the status of a state - closed or revoked.

This is a bit untidy and could get worse when "admin-revoked" states are
added.  So clean it up.

With this patch, the type is now all that is stored in sc_type.  This is
zero when the state is first added to ->cl_stateids (causing it to be
ignored), and is then set appropriately once it is fully initialised.
It is set under ->cl_lock to ensure atomicity w.r.t lookup.  It is now
never cleared.

sc_type is still a bit-set even though at most one bit is set.  This allows
lookup functions to be given a bitmap of acceptable types.

cl_type is now an unsigned short rather than char.  There is no value in
restricting to just 8 bits.  When passed to a function or stored in local
variable, "unsigned int" is used.

The status is stored in a separate short named "cl_status".  It has two
flags: NFS4_STID_CLOSED and NFS4_STID_REVOKED.
CLOSED combines NFS4_CLOSED_STID, NFS4_CLOSED_DELEG_STID, and is used
for LOCK_STID and LAYOUT_STID instead of setting the sc_type to zero.
These flags are only ever set, never cleared.
For deleg stateids they are set under the global state_lock.
For open and lock stateids they are set under ->cl_lock.
For layout stateids they are set under ->ls_lock

nfs4_unhash_stid() has been removed, and we never set sc_type = 0.  This
was only used for LOCK and LAYOUT stids and they now use
NFS4_STID_CLOSED.

Also TRACE_DEFINE_NUM() calls for the various STD #define have been
removed because these things are not enums, and so that call is
incorrect.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/nfsd/nfs4layouts.c |   6 +-
 fs/nfsd/nfs4state.c   | 165 +++++++++++++++++++++---------------------
 fs/nfsd/state.h       |  40 +++++++---
 fs/nfsd/trace.h       |  23 +++---
 4 files changed, 122 insertions(+), 112 deletions(-)

diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
index 5e8096bc5eaa..eb8bb05a8df5 100644
--- a/fs/nfsd/nfs4layouts.c
+++ b/fs/nfsd/nfs4layouts.c
@@ -269,13 +269,13 @@ nfsd4_preprocess_layout_stateid(struct svc_rqst *rqstp,
 {
 	struct nfs4_layout_stateid *ls;
 	struct nfs4_stid *stid;
-	unsigned char typemask = NFS4_LAYOUT_STID;
+	unsigned int typemask = NFS4_LAYOUT_STID;
 	__be32 status;
 
 	if (create)
 		typemask |= (NFS4_OPEN_STID | NFS4_LOCK_STID | NFS4_DELEG_STID);
 
-	status = nfsd4_lookup_stateid(cstate, stateid, typemask, &stid,
+	status = nfsd4_lookup_stateid(cstate, stateid, typemask, 0, &stid,
 			net_generic(SVC_NET(rqstp), nfsd_net_id));
 	if (status)
 		goto out;
@@ -518,7 +518,7 @@ nfsd4_return_file_layouts(struct svc_rqst *rqstp,
 		lrp->lrs_present = true;
 	} else {
 		trace_nfsd_layoutstate_unhash(&ls->ls_stid.sc_stateid);
-		nfs4_unhash_stid(&ls->ls_stid);
+		ls->ls_stid.sc_status |= NFS4_STID_CLOSED;
 		lrp->lrs_present = false;
 	}
 	spin_unlock(&ls->ls_lock);
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 7469583382fb..b1fdd0ee0f5c 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1265,11 +1265,6 @@ static void destroy_unhashed_deleg(struct nfs4_delegation *dp)
 	nfs4_put_stid(&dp->dl_stid);
 }
 
-void nfs4_unhash_stid(struct nfs4_stid *s)
-{
-	s->sc_type = 0;
-}
-
 /**
  * nfs4_delegation_exists - Discover if this delegation already exists
  * @clp:     a pointer to the nfs4_client we're granting a delegation to
@@ -1334,7 +1329,7 @@ static bool delegation_hashed(struct nfs4_delegation *dp)
 }
 
 static bool
-unhash_delegation_locked(struct nfs4_delegation *dp, unsigned char type)
+unhash_delegation_locked(struct nfs4_delegation *dp, unsigned int statusmask)
 {
 	struct nfs4_file *fp = dp->dl_stid.sc_file;
 
@@ -1344,8 +1339,9 @@ unhash_delegation_locked(struct nfs4_delegation *dp, unsigned char type)
 		return false;
 
 	if (dp->dl_stid.sc_client->cl_minorversion == 0)
-		type = NFS4_CLOSED_DELEG_STID;
-	dp->dl_stid.sc_type = type;
+		statusmask = NFS4_STID_CLOSED;
+	dp->dl_stid.sc_status |= statusmask;
+
 	/* Ensure that deleg break won't try to requeue it */
 	++dp->dl_time;
 	spin_lock(&fp->fi_lock);
@@ -1361,7 +1357,7 @@ static void destroy_delegation(struct nfs4_delegation *dp)
 	bool unhashed;
 
 	spin_lock(&state_lock);
-	unhashed = unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID);
+	unhashed = unhash_delegation_locked(dp, NFS4_STID_CLOSED);
 	spin_unlock(&state_lock);
 	if (unhashed)
 		destroy_unhashed_deleg(dp);
@@ -1375,7 +1371,7 @@ static void revoke_delegation(struct nfs4_delegation *dp)
 
 	trace_nfsd_stid_revoke(&dp->dl_stid);
 
-	if (dp->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID) {
+	if (dp->dl_stid.sc_status & NFS4_STID_REVOKED) {
 		spin_lock(&clp->cl_lock);
 		refcount_inc(&dp->dl_stid.sc_count);
 		list_add(&dp->dl_recall_lru, &clp->cl_revoked);
@@ -1384,8 +1380,8 @@ static void revoke_delegation(struct nfs4_delegation *dp)
 	destroy_unhashed_deleg(dp);
 }
 
-/* 
- * SETCLIENTID state 
+/*
+ * SETCLIENTID state
  */
 
 static unsigned int clientid_hashval(u32 id)
@@ -1548,7 +1544,7 @@ static bool unhash_lock_stateid(struct nfs4_ol_stateid *stp)
 	if (!unhash_ol_stateid(stp))
 		return false;
 	list_del_init(&stp->st_locks);
-	nfs4_unhash_stid(&stp->st_stid);
+	stp->st_stid.sc_status |= NFS4_STID_CLOSED;
 	return true;
 }
 
@@ -1627,6 +1623,7 @@ static void release_open_stateid(struct nfs4_ol_stateid *stp)
 	LIST_HEAD(reaplist);
 
 	spin_lock(&stp->st_stid.sc_client->cl_lock);
+	stp->st_stid.sc_status |= NFS4_STID_CLOSED;
 	if (unhash_open_stateid(stp, &reaplist))
 		put_ol_stateid_locked(stp, &reaplist);
 	spin_unlock(&stp->st_stid.sc_client->cl_lock);
@@ -2235,7 +2232,7 @@ __destroy_client(struct nfs4_client *clp)
 	spin_lock(&state_lock);
 	while (!list_empty(&clp->cl_delegations)) {
 		dp = list_entry(clp->cl_delegations.next, struct nfs4_delegation, dl_perclnt);
-		WARN_ON(!unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID));
+		WARN_ON(!unhash_delegation_locked(dp, NFS4_STID_CLOSED));
 		list_add(&dp->dl_recall_lru, &reaplist);
 	}
 	spin_unlock(&state_lock);
@@ -2467,14 +2464,16 @@ find_stateid_locked(struct nfs4_client *cl, stateid_t *t)
 }
 
 static struct nfs4_stid *
-find_stateid_by_type(struct nfs4_client *cl, stateid_t *t, char typemask)
+find_stateid_by_type(struct nfs4_client *cl, stateid_t *t,
+		     unsigned int typemask, unsigned int ok_states)
 {
 	struct nfs4_stid *s;
 
 	spin_lock(&cl->cl_lock);
 	s = find_stateid_locked(cl, t);
 	if (s != NULL) {
-		if (typemask & s->sc_type)
+		if ((s->sc_status & ~ok_states) == 0 &&
+		    (typemask & s->sc_type))
 			refcount_inc(&s->sc_count);
 		else
 			s = NULL;
@@ -4584,7 +4583,8 @@ nfsd4_find_existing_open(struct nfs4_file *fp, struct nfsd4_open *open)
 			continue;
 		if (local->st_stateowner != &oo->oo_owner)
 			continue;
-		if (local->st_stid.sc_type == NFS4_OPEN_STID) {
+		if (local->st_stid.sc_type == NFS4_OPEN_STID &&
+		    !local->st_stid.sc_status) {
 			ret = local;
 			refcount_inc(&ret->st_stid.sc_count);
 			break;
@@ -4598,17 +4598,10 @@ nfsd4_verify_open_stid(struct nfs4_stid *s)
 {
 	__be32 ret = nfs_ok;
 
-	switch (s->sc_type) {
-	default:
-		break;
-	case 0:
-	case NFS4_CLOSED_STID:
-	case NFS4_CLOSED_DELEG_STID:
-		ret = nfserr_bad_stateid;
-		break;
-	case NFS4_REVOKED_DELEG_STID:
+	if (s->sc_status & NFS4_STID_REVOKED)
 		ret = nfserr_deleg_revoked;
-	}
+	else if (s->sc_status & NFS4_STID_CLOSED)
+		ret = nfserr_bad_stateid;
 	return ret;
 }
 
@@ -4921,9 +4914,9 @@ static int nfsd4_cb_recall_done(struct nfsd4_callback *cb,
 
 	trace_nfsd_cb_recall_done(&dp->dl_stid.sc_stateid, task);
 
-	if (dp->dl_stid.sc_type == NFS4_CLOSED_DELEG_STID ||
-	    dp->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID)
-	        return 1;
+	if (dp->dl_stid.sc_status)
+		/* CLOSED or REVOKED */
+		return 1;
 
 	switch (task->tk_status) {
 	case 0:
@@ -5168,12 +5161,12 @@ static int share_access_to_flags(u32 share_access)
 	return share_access == NFS4_SHARE_ACCESS_READ ? RD_STATE : WR_STATE;
 }
 
-static struct nfs4_delegation *find_deleg_stateid(struct nfs4_client *cl, stateid_t *s)
+static struct nfs4_delegation *find_deleg_stateid(struct nfs4_client *cl,
+						  stateid_t *s)
 {
 	struct nfs4_stid *ret;
 
-	ret = find_stateid_by_type(cl, s,
-				NFS4_DELEG_STID|NFS4_REVOKED_DELEG_STID);
+	ret = find_stateid_by_type(cl, s, NFS4_DELEG_STID, NFS4_STID_REVOKED);
 	if (!ret)
 		return NULL;
 	return delegstateid(ret);
@@ -5196,7 +5189,7 @@ nfs4_check_deleg(struct nfs4_client *cl, struct nfsd4_open *open,
 	deleg = find_deleg_stateid(cl, &open->op_delegate_stateid);
 	if (deleg == NULL)
 		goto out;
-	if (deleg->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID) {
+	if (deleg->dl_stid.sc_status & NFS4_STID_REVOKED) {
 		nfs4_put_stid(&deleg->dl_stid);
 		status = nfserr_deleg_revoked;
 		goto out;
@@ -5843,7 +5836,6 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
 	} else {
 		status = nfs4_get_vfs_file(rqstp, fp, current_fh, stp, open, true);
 		if (status) {
-			stp->st_stid.sc_type = NFS4_CLOSED_STID;
 			release_open_stateid(stp);
 			mutex_unlock(&stp->st_mutex);
 			goto out;
@@ -6235,7 +6227,7 @@ nfs4_laundromat(struct nfsd_net *nn)
 		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
 		if (!state_expired(&lt, dp->dl_time))
 			break;
-		WARN_ON(!unhash_delegation_locked(dp, NFS4_REVOKED_DELEG_STID));
+		WARN_ON(!unhash_delegation_locked(dp, NFS4_STID_REVOKED));
 		list_add(&dp->dl_recall_lru, &reaplist);
 	}
 	spin_unlock(&state_lock);
@@ -6474,22 +6466,20 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
 	status = nfsd4_stid_check_stateid_generation(stateid, s, 1);
 	if (status)
 		goto out_unlock;
+	status = nfsd4_verify_open_stid(s);
+	if (status)
+		goto out_unlock;
+
 	switch (s->sc_type) {
 	case NFS4_DELEG_STID:
 		status = nfs_ok;
 		break;
-	case NFS4_REVOKED_DELEG_STID:
-		status = nfserr_deleg_revoked;
-		break;
 	case NFS4_OPEN_STID:
 	case NFS4_LOCK_STID:
 		status = nfsd4_check_openowner_confirmed(openlockstateid(s));
 		break;
 	default:
 		printk("unknown stateid type %x\n", s->sc_type);
-		fallthrough;
-	case NFS4_CLOSED_STID:
-	case NFS4_CLOSED_DELEG_STID:
 		status = nfserr_bad_stateid;
 	}
 out_unlock:
@@ -6499,7 +6489,8 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
 
 __be32
 nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
-		     stateid_t *stateid, unsigned char typemask,
+		     stateid_t *stateid,
+		     unsigned int typemask, unsigned int statusmask,
 		     struct nfs4_stid **s, struct nfsd_net *nn)
 {
 	__be32 status;
@@ -6510,10 +6501,13 @@ nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
 	 *  only return revoked delegations if explicitly asked.
 	 *  otherwise we report revoked or bad_stateid status.
 	 */
-	if (typemask & NFS4_REVOKED_DELEG_STID)
+	if (statusmask & NFS4_STID_REVOKED)
 		return_revoked = true;
-	else if (typemask & NFS4_DELEG_STID)
-		typemask |= NFS4_REVOKED_DELEG_STID;
+	if (typemask & NFS4_DELEG_STID)
+		/* Always allow REVOKED for DELEG so we can
+		 * retturn the appropriate error.
+		 */
+		statusmask |= NFS4_STID_REVOKED;
 
 	if (ZERO_STATEID(stateid) || ONE_STATEID(stateid) ||
 		CLOSE_STATEID(stateid))
@@ -6526,14 +6520,12 @@ nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
 	}
 	if (status)
 		return status;
-	stid = find_stateid_by_type(cstate->clp, stateid, typemask);
+	stid = find_stateid_by_type(cstate->clp, stateid, typemask, statusmask);
 	if (!stid)
 		return nfserr_bad_stateid;
-	if ((stid->sc_type == NFS4_REVOKED_DELEG_STID) && !return_revoked) {
+	if ((stid->sc_status & NFS4_STID_REVOKED) && !return_revoked) {
 		nfs4_put_stid(stid);
-		if (cstate->minorversion)
-			return nfserr_deleg_revoked;
-		return nfserr_bad_stateid;
+		return nfserr_deleg_revoked;
 	}
 	*s = stid;
 	return nfs_ok;
@@ -6544,7 +6536,7 @@ nfs4_find_file(struct nfs4_stid *s, int flags)
 {
 	struct nfsd_file *ret = NULL;
 
-	if (!s)
+	if (!s || s->sc_status)
 		return NULL;
 
 	switch (s->sc_type) {
@@ -6667,7 +6659,8 @@ static __be32 find_cpntf_state(struct nfsd_net *nn, stateid_t *st,
 		goto out;
 
 	*stid = find_stateid_by_type(found, &cps->cp_p_stateid,
-			NFS4_DELEG_STID|NFS4_OPEN_STID|NFS4_LOCK_STID);
+				     NFS4_DELEG_STID|NFS4_OPEN_STID|NFS4_LOCK_STID,
+				     0);
 	if (*stid)
 		status = nfs_ok;
 	else
@@ -6725,7 +6718,7 @@ nfs4_preprocess_stateid_op(struct svc_rqst *rqstp,
 
 	status = nfsd4_lookup_stateid(cstate, stateid,
 				NFS4_DELEG_STID|NFS4_OPEN_STID|NFS4_LOCK_STID,
-				&s, nn);
+				0, &s, nn);
 	if (status == nfserr_bad_stateid)
 		status = find_cpntf_state(nn, stateid, &s);
 	if (status)
@@ -6743,9 +6736,6 @@ nfs4_preprocess_stateid_op(struct svc_rqst *rqstp,
 	case NFS4_LOCK_STID:
 		status = nfs4_check_olstateid(openlockstateid(s), flags);
 		break;
-	default:
-		status = nfserr_bad_stateid;
-		break;
 	}
 	if (status)
 		goto out;
@@ -6824,11 +6814,20 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 
 	spin_lock(&cl->cl_lock);
 	s = find_stateid_locked(cl, stateid);
-	if (!s)
+	if (!s || s->sc_status & NFS4_STID_CLOSED)
 		goto out_unlock;
 	spin_lock(&s->sc_lock);
 	switch (s->sc_type) {
 	case NFS4_DELEG_STID:
+		if (s->sc_status & NFS4_STID_REVOKED) {
+			spin_unlock(&s->sc_lock);
+			dp = delegstateid(s);
+			list_del_init(&dp->dl_recall_lru);
+			spin_unlock(&cl->cl_lock);
+			nfs4_put_stid(s);
+			ret = nfs_ok;
+			goto out;
+		}
 		ret = nfserr_locks_held;
 		break;
 	case NFS4_OPEN_STID:
@@ -6843,14 +6842,6 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		spin_unlock(&cl->cl_lock);
 		ret = nfsd4_free_lock_stateid(stateid, s);
 		goto out;
-	case NFS4_REVOKED_DELEG_STID:
-		spin_unlock(&s->sc_lock);
-		dp = delegstateid(s);
-		list_del_init(&dp->dl_recall_lru);
-		spin_unlock(&cl->cl_lock);
-		nfs4_put_stid(s);
-		ret = nfs_ok;
-		goto out;
 	/* Default falls through and returns nfserr_bad_stateid */
 	}
 	spin_unlock(&s->sc_lock);
@@ -6893,6 +6884,7 @@ static __be32 nfs4_seqid_op_checks(struct nfsd4_compound_state *cstate, stateid_
  * @seqid: seqid (provided by client)
  * @stateid: stateid (provided by client)
  * @typemask: mask of allowable types for this operation
+ * @statusmask: mask of allowed states: 0 or STID_CLOSED
  * @stpp: return pointer for the stateid found
  * @nn: net namespace for request
  *
@@ -6902,7 +6894,8 @@ static __be32 nfs4_seqid_op_checks(struct nfsd4_compound_state *cstate, stateid_
  */
 static __be32
 nfs4_preprocess_seqid_op(struct nfsd4_compound_state *cstate, u32 seqid,
-			 stateid_t *stateid, char typemask,
+			 stateid_t *stateid,
+			 unsigned int typemask, unsigned int statusmask,
 			 struct nfs4_ol_stateid **stpp,
 			 struct nfsd_net *nn)
 {
@@ -6913,7 +6906,8 @@ nfs4_preprocess_seqid_op(struct nfsd4_compound_state *cstate, u32 seqid,
 	trace_nfsd_preprocess(seqid, stateid);
 
 	*stpp = NULL;
-	status = nfsd4_lookup_stateid(cstate, stateid, typemask, &s, nn);
+	status = nfsd4_lookup_stateid(cstate, stateid,
+				      typemask, statusmask, &s, nn);
 	if (status)
 		return status;
 	stp = openlockstateid(s);
@@ -6935,7 +6929,7 @@ static __be32 nfs4_preprocess_confirmed_seqid_op(struct nfsd4_compound_state *cs
 	struct nfs4_ol_stateid *stp;
 
 	status = nfs4_preprocess_seqid_op(cstate, seqid, stateid,
-						NFS4_OPEN_STID, &stp, nn);
+					  NFS4_OPEN_STID, 0, &stp, nn);
 	if (status)
 		return status;
 	oo = openowner(stp->st_stateowner);
@@ -6966,8 +6960,8 @@ nfsd4_open_confirm(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		return status;
 
 	status = nfs4_preprocess_seqid_op(cstate,
-					oc->oc_seqid, &oc->oc_req_stateid,
-					NFS4_OPEN_STID, &stp, nn);
+					  oc->oc_seqid, &oc->oc_req_stateid,
+					  NFS4_OPEN_STID, 0, &stp, nn);
 	if (status)
 		goto out;
 	oo = openowner(stp->st_stateowner);
@@ -7097,18 +7091,20 @@ nfsd4_close(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	struct net *net = SVC_NET(rqstp);
 	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
 
-	dprintk("NFSD: nfsd4_close on file %pd\n", 
+	dprintk("NFSD: nfsd4_close on file %pd\n",
 			cstate->current_fh.fh_dentry);
 
 	status = nfs4_preprocess_seqid_op(cstate, close->cl_seqid,
-					&close->cl_stateid,
-					NFS4_OPEN_STID|NFS4_CLOSED_STID,
-					&stp, nn);
+					  &close->cl_stateid,
+					  NFS4_OPEN_STID, NFS4_STID_CLOSED,
+					  &stp, nn);
 	nfsd4_bump_seqid(cstate, status);
 	if (status)
-		goto out; 
+		goto out;
 
-	stp->st_stid.sc_type = NFS4_CLOSED_STID;
+	spin_lock(&stp->st_stid.sc_client->cl_lock);
+	stp->st_stid.sc_status |= NFS4_STID_CLOSED;
+	spin_unlock(&stp->st_stid.sc_client->cl_lock);
 
 	/*
 	 * Technically we don't _really_ have to increment or copy it, since
@@ -7150,7 +7146,7 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	if ((status = fh_verify(rqstp, &cstate->current_fh, S_IFREG, 0)))
 		return status;
 
-	status = nfsd4_lookup_stateid(cstate, stateid, NFS4_DELEG_STID, &s, nn);
+	status = nfsd4_lookup_stateid(cstate, stateid, NFS4_DELEG_STID, 0, &s, nn);
 	if (status)
 		goto out;
 	dp = delegstateid(s);
@@ -7604,9 +7600,10 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 							&lock_stp, &new);
 	} else {
 		status = nfs4_preprocess_seqid_op(cstate,
-				       lock->lk_old_lock_seqid,
-				       &lock->lk_old_lock_stateid,
-				       NFS4_LOCK_STID, &lock_stp, nn);
+						  lock->lk_old_lock_seqid,
+						  &lock->lk_old_lock_stateid,
+						  NFS4_LOCK_STID, 0, &lock_stp,
+						  nn);
 	}
 	if (status)
 		goto out;
@@ -7919,8 +7916,8 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		 return nfserr_inval;
 
 	status = nfs4_preprocess_seqid_op(cstate, locku->lu_seqid,
-					&locku->lu_stateid, NFS4_LOCK_STID,
-					&stp, nn);
+					  &locku->lu_stateid, NFS4_LOCK_STID, 0,
+					  &stp, nn);
 	if (status)
 		goto out;
 	nf = find_any_file(stp->st_stid.sc_file);
@@ -8350,7 +8347,7 @@ nfs4_state_shutdown_net(struct net *net)
 	spin_lock(&state_lock);
 	list_for_each_safe(pos, next, &nn->del_recall_lru) {
 		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
-		WARN_ON(!unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID));
+		WARN_ON(!unhash_delegation_locked(dp, NFS4_STID_CLOSED));
 		list_add(&dp->dl_recall_lru, &reaplist);
 	}
 	spin_unlock(&state_lock);
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index f96eaa8e9413..22f9eb3986d2 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -88,17 +88,33 @@ struct nfsd4_callback_ops {
  */
 struct nfs4_stid {
 	refcount_t		sc_count;
-#define NFS4_OPEN_STID 1
-#define NFS4_LOCK_STID 2
-#define NFS4_DELEG_STID 4
-/* For an open stateid kept around *only* to process close replays: */
-#define NFS4_CLOSED_STID 8
+
+	/* A new stateid is added to the cl_stateids idr early before it
+	 * is fully initialised.  Its sc_type is then zero.  After
+	 * initialisation the sc_type it set under cl_lock, and then
+	 * never changes.
+	 */
+#define NFS4_OPEN_STID		BIT(0)
+#define NFS4_LOCK_STID		BIT(1)
+#define NFS4_DELEG_STID		BIT(2)
+#define NFS4_LAYOUT_STID	BIT(3)
+	unsigned short		sc_type;
+
+/* state_lock protects sc_status for delegation stateids.
+ * ->cl_lock protects sc_status for open and lock stateids.
+ * ->st_mutex also protect sc_status for open stateids.
+ * ->ls_lock protects sc_status for layout stateids.
+ */
+/*
+ * For an open stateid kept around *only* to process close replays.
+ * For deleg stateid, kept in idr until last reference is dropped.
+ */
+#define NFS4_STID_CLOSED	BIT(0)
 /* For a deleg stateid kept around only to process free_stateid's: */
-#define NFS4_REVOKED_DELEG_STID 16
-#define NFS4_CLOSED_DELEG_STID 32
-#define NFS4_LAYOUT_STID 64
+#define NFS4_STID_REVOKED	BIT(1)
+	unsigned short		sc_status;
+
 	struct list_head	sc_cp_list;
-	unsigned char		sc_type;
 	stateid_t		sc_stateid;
 	spinlock_t		sc_lock;
 	struct nfs4_client	*sc_client;
@@ -694,15 +710,15 @@ extern __be32 nfs4_preprocess_stateid_op(struct svc_rqst *rqstp,
 		stateid_t *stateid, int flags, struct nfsd_file **filp,
 		struct nfs4_stid **cstid);
 __be32 nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
-		     stateid_t *stateid, unsigned char typemask,
-		     struct nfs4_stid **s, struct nfsd_net *nn);
+			    stateid_t *stateid, unsigned int typemask,
+			    unsigned int statusmask,
+			    struct nfs4_stid **s, struct nfsd_net *nn);
 struct nfs4_stid *nfs4_alloc_stid(struct nfs4_client *cl, struct kmem_cache *slab,
 				  void (*sc_free)(struct nfs4_stid *));
 int nfs4_init_copy_state(struct nfsd_net *nn, struct nfsd4_copy *copy);
 void nfs4_free_copy_state(struct nfsd4_copy *copy);
 struct nfs4_cpntf_state *nfs4_alloc_init_cpntf_state(struct nfsd_net *nn,
 			struct nfs4_stid *p_stid);
-void nfs4_unhash_stid(struct nfs4_stid *s);
 void nfs4_put_stid(struct nfs4_stid *s);
 void nfs4_inc_and_copy_stateid(stateid_t *dst, struct nfs4_stid *stid);
 void nfs4_remove_reclaim_record(struct nfs4_client_reclaim *, struct nfsd_net *);
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index fbc0ccb40424..54d82e14d724 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -641,24 +641,18 @@ DEFINE_EVENT(nfsd_stateseqid_class, nfsd_##name, \
 DEFINE_STATESEQID_EVENT(preprocess);
 DEFINE_STATESEQID_EVENT(open_confirm);
 
-TRACE_DEFINE_ENUM(NFS4_OPEN_STID);
-TRACE_DEFINE_ENUM(NFS4_LOCK_STID);
-TRACE_DEFINE_ENUM(NFS4_DELEG_STID);
-TRACE_DEFINE_ENUM(NFS4_CLOSED_STID);
-TRACE_DEFINE_ENUM(NFS4_REVOKED_DELEG_STID);
-TRACE_DEFINE_ENUM(NFS4_CLOSED_DELEG_STID);
-TRACE_DEFINE_ENUM(NFS4_LAYOUT_STID);
-
 #define show_stid_type(x)						\
 	__print_flags(x, "|",						\
 		{ NFS4_OPEN_STID,		"OPEN" },		\
 		{ NFS4_LOCK_STID,		"LOCK" },		\
 		{ NFS4_DELEG_STID,		"DELEG" },		\
-		{ NFS4_CLOSED_STID,		"CLOSED" },		\
-		{ NFS4_REVOKED_DELEG_STID,	"REVOKED" },		\
-		{ NFS4_CLOSED_DELEG_STID,	"CLOSED_DELEG" },	\
 		{ NFS4_LAYOUT_STID,		"LAYOUT" })
 
+#define show_stid_status(x)						\
+	__print_flags(x, "|",						\
+		{ NFS4_STID_CLOSED,		"CLOSED" },		\
+		{ NFS4_STID_REVOKED,		"REVOKED" })		\
+
 DECLARE_EVENT_CLASS(nfsd_stid_class,
 	TP_PROTO(
 		const struct nfs4_stid *stid
@@ -666,6 +660,7 @@ DECLARE_EVENT_CLASS(nfsd_stid_class,
 	TP_ARGS(stid),
 	TP_STRUCT__entry(
 		__field(unsigned long, sc_type)
+		__field(unsigned long, sc_status)
 		__field(int, sc_count)
 		__field(u32, cl_boot)
 		__field(u32, cl_id)
@@ -676,16 +671,18 @@ DECLARE_EVENT_CLASS(nfsd_stid_class,
 		const stateid_t *stp = &stid->sc_stateid;
 
 		__entry->sc_type = stid->sc_type;
+		__entry->sc_status = stid->sc_status;
 		__entry->sc_count = refcount_read(&stid->sc_count);
 		__entry->cl_boot = stp->si_opaque.so_clid.cl_boot;
 		__entry->cl_id = stp->si_opaque.so_clid.cl_id;
 		__entry->si_id = stp->si_opaque.so_id;
 		__entry->si_generation = stp->si_generation;
 	),
-	TP_printk("client %08x:%08x stateid %08x:%08x ref=%d type=%s",
+	TP_printk("client %08x:%08x stateid %08x:%08x ref=%d type=%s state=%s",
 		__entry->cl_boot, __entry->cl_id,
 		__entry->si_id, __entry->si_generation,
-		__entry->sc_count, show_stid_type(__entry->sc_type)
+		__entry->sc_count, show_stid_type(__entry->sc_type),
+		show_stid_status(__entry->sc_status)
 	)
 );
 
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 4/9] nfsd: prepare for supporting admin-revocation of state
  2023-11-17  2:18 [PATCH 0/9 v3] support admin-revocation of v4 state NeilBrown
                   ` (2 preceding siblings ...)
  2023-11-17  2:18 ` [PATCH 3/9] nfsd: split sc_status out of sc_type NeilBrown
@ 2023-11-17  2:18 ` NeilBrown
  2023-11-17  2:18 ` [PATCH 5/9] nfsd: allow admin-revoked state to appear in /proc/fs/nfsd/clients/*/states NeilBrown
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 31+ messages in thread
From: NeilBrown @ 2023-11-17  2:18 UTC (permalink / raw)
  To: Chuck Lever, Jeff Layton
  Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

The NFSv4 protocol allows state to be revoked by the admin and has error
codes which allow this to be communicated to the client.

This patch
 - introduces a new state-id status NFS4_STID_ADMIN_REVOKE
   which can be set on open, lock, or delegation state.
 - reports NFS4ERR_ADMIN_REVOKED when these are accessed
 - introduces a per-client counter of these states and returns
   SEQ4_STATUS_ADMIN_STATE_REVOKED when the counter is not zero.
   Decrements this when freeing any admin-revoked state.
 - introduces stub code to find all interesting states for a given
   superblock so they can be revoked via the 'unlock_filesystem'
   file in /proc/fs/nfsd/
   No actual states are handled yet.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/nfsd/nfs4state.c | 71 ++++++++++++++++++++++++++++++++++++++++++++-
 fs/nfsd/nfsctl.c    |  1 +
 fs/nfsd/nfsd.h      |  1 +
 fs/nfsd/state.h     | 10 +++++++
 fs/nfsd/trace.h     |  3 +-
 5 files changed, 84 insertions(+), 2 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index b1fdd0ee0f5c..a2b3320a6ba8 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1215,6 +1215,8 @@ nfs4_put_stid(struct nfs4_stid *s)
 		return;
 	}
 	idr_remove(&clp->cl_stateids, s->sc_stateid.si_opaque.so_id);
+	if (s->sc_status & NFS4_STID_ADMIN_REVOKED)
+		atomic_dec(&s->sc_client->cl_admin_revoked);
 	nfs4_free_cpntf_statelist(clp->net, s);
 	spin_unlock(&clp->cl_lock);
 	s->sc_free(s);
@@ -1534,6 +1536,8 @@ static void put_ol_stateid_locked(struct nfs4_ol_stateid *stp,
 	}
 
 	idr_remove(&clp->cl_stateids, s->sc_stateid.si_opaque.so_id);
+	if (s->sc_status & NFS4_STID_ADMIN_REVOKED)
+		atomic_dec(&s->sc_client->cl_admin_revoked);
 	list_add(&stp->st_locks, reaplist);
 }
 
@@ -1679,6 +1683,54 @@ static void release_openowner(struct nfs4_openowner *oo)
 	nfs4_put_stateowner(&oo->oo_owner);
 }
 
+static struct nfs4_stid *find_one_sb_stid(struct nfs4_client *clp,
+					  struct super_block *sb,
+					  unsigned int sc_types)
+{
+	unsigned long id, tmp;
+	struct nfs4_stid *stid;
+
+	spin_lock(&clp->cl_lock);
+	idr_for_each_entry_ul(&clp->cl_stateids, stid, tmp, id)
+		if ((stid->sc_type & sc_types) &&
+		    stid->sc_status == 0 &&
+		    stid->sc_file->fi_inode->i_sb == sb) {
+			refcount_inc(&stid->sc_count);
+			break;
+		}
+	spin_unlock(&clp->cl_lock);
+	return stid;
+}
+
+void nfsd4_revoke_states(struct net *net, struct super_block *sb)
+{
+	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
+	unsigned int idhashval;
+	unsigned int sc_types;
+
+	sc_types = 0;
+
+	spin_lock(&nn->client_lock);
+	for (idhashval = 0; idhashval < CLIENT_HASH_MASK; idhashval++) {
+		struct list_head *head = &nn->conf_id_hashtbl[idhashval];
+		struct nfs4_client *clp;
+	retry:
+		list_for_each_entry(clp, head, cl_idhash) {
+			struct nfs4_stid *stid = find_one_sb_stid(clp, sb,
+								  sc_types);
+			if (stid) {
+				spin_unlock(&nn->client_lock);
+				switch (stid->sc_type) {
+				}
+				nfs4_put_stid(stid);
+				spin_lock(&nn->client_lock);
+				goto retry;
+			}
+		}
+	}
+	spin_unlock(&nn->client_lock);
+}
+
 static inline int
 hash_sessionid(struct nfs4_sessionid *sessionid)
 {
@@ -2550,6 +2602,8 @@ static int client_info_show(struct seq_file *m, void *v)
 	}
 	seq_printf(m, "callback state: %s\n", cb_state2str(clp->cl_cb_state));
 	seq_printf(m, "callback address: %pISpc\n", &clp->cl_cb_conn.cb_addr);
+	seq_printf(m, "admin-revoked states: %d\n",
+		   atomic_read(&clp->cl_admin_revoked));
 	drop_client(clp);
 
 	return 0;
@@ -4109,6 +4163,8 @@ nfsd4_sequence(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	}
 	if (!list_empty(&clp->cl_revoked))
 		seq->status_flags |= SEQ4_STATUS_RECALLABLE_STATE_REVOKED;
+	if (atomic_read(&clp->cl_admin_revoked))
+		seq->status_flags |= SEQ4_STATUS_ADMIN_STATE_REVOKED;
 out_no_session:
 	if (conn)
 		free_conn(conn);
@@ -4598,7 +4654,9 @@ nfsd4_verify_open_stid(struct nfs4_stid *s)
 {
 	__be32 ret = nfs_ok;
 
-	if (s->sc_status & NFS4_STID_REVOKED)
+	if (s->sc_status & NFS4_STID_ADMIN_REVOKED)
+		ret = nfserr_admin_revoked;
+	else if (s->sc_status & NFS4_STID_REVOKED)
 		ret = nfserr_deleg_revoked;
 	else if (s->sc_status & NFS4_STID_CLOSED)
 		ret = nfserr_bad_stateid;
@@ -5189,6 +5247,11 @@ nfs4_check_deleg(struct nfs4_client *cl, struct nfsd4_open *open,
 	deleg = find_deleg_stateid(cl, &open->op_delegate_stateid);
 	if (deleg == NULL)
 		goto out;
+	if (deleg->dl_stid.sc_status & NFS4_STID_ADMIN_REVOKED) {
+		nfs4_put_stid(&deleg->dl_stid);
+		status = nfserr_admin_revoked;
+		goto out;
+	}
 	if (deleg->dl_stid.sc_status & NFS4_STID_REVOKED) {
 		nfs4_put_stid(&deleg->dl_stid);
 		status = nfserr_deleg_revoked;
@@ -6509,6 +6572,8 @@ nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
 		 */
 		statusmask |= NFS4_STID_REVOKED;
 
+	statusmask |= NFS4_STID_ADMIN_REVOKED;
+
 	if (ZERO_STATEID(stateid) || ONE_STATEID(stateid) ||
 		CLOSE_STATEID(stateid))
 		return nfserr_bad_stateid;
@@ -6527,6 +6592,10 @@ nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
 		nfs4_put_stid(stid);
 		return nfserr_deleg_revoked;
 	}
+	if (stid->sc_type & NFS4_STID_ADMIN_REVOKED) {
+		nfs4_put_stid(stid);
+		return nfserr_admin_revoked;
+	}
 	*s = stid;
 	return nfs_ok;
 }
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 739ed5bf71cd..50368eec86b0 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -281,6 +281,7 @@ static ssize_t write_unlock_fs(struct file *file, char *buf, size_t size)
 	 * 3.  Is that directory the root of an exported file system?
 	 */
 	error = nlmsvc_unlock_all_by_sb(path.dentry->d_sb);
+	nfsd4_revoke_states(netns(file), path.dentry->d_sb);
 
 	path_put(&path);
 	return error;
diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
index f5ff42f41ee7..d46203eac3c8 100644
--- a/fs/nfsd/nfsd.h
+++ b/fs/nfsd/nfsd.h
@@ -280,6 +280,7 @@ void		nfsd_lockd_shutdown(void);
 #define	nfserr_no_grace		cpu_to_be32(NFSERR_NO_GRACE)
 #define	nfserr_reclaim_bad	cpu_to_be32(NFSERR_RECLAIM_BAD)
 #define	nfserr_badname		cpu_to_be32(NFSERR_BADNAME)
+#define	nfserr_admin_revoked	cpu_to_be32(NFS4ERR_ADMIN_REVOKED)
 #define	nfserr_cb_path_down	cpu_to_be32(NFSERR_CB_PATH_DOWN)
 #define	nfserr_locked		cpu_to_be32(NFSERR_LOCKED)
 #define	nfserr_wrongsec		cpu_to_be32(NFSERR_WRONGSEC)
diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
index 22f9eb3986d2..18bc2889136c 100644
--- a/fs/nfsd/state.h
+++ b/fs/nfsd/state.h
@@ -112,6 +112,7 @@ struct nfs4_stid {
 #define NFS4_STID_CLOSED	BIT(0)
 /* For a deleg stateid kept around only to process free_stateid's: */
 #define NFS4_STID_REVOKED	BIT(1)
+#define NFS4_STID_ADMIN_REVOKED	BIT(2)
 	unsigned short		sc_status;
 
 	struct list_head	sc_cp_list;
@@ -388,6 +389,7 @@ struct nfs4_client {
 	clientid_t		cl_clientid;	/* generated by server */
 	nfs4_verifier		cl_confirm;	/* generated by server */
 	u32			cl_minorversion;
+	atomic_t		cl_admin_revoked; /* count of admin-revoked states */
 	/* NFSv4.1 client implementation id: */
 	struct xdr_netobj	cl_nii_domain;
 	struct xdr_netobj	cl_nii_name;
@@ -752,6 +754,14 @@ static inline void get_nfs4_file(struct nfs4_file *fi)
 }
 struct nfsd_file *find_any_file(struct nfs4_file *f);
 
+#ifdef CONFIG_NFSD_V4
+void nfsd4_revoke_states(struct net *net, struct super_block *sb);
+#else
+static inline void nfsd4_revoke_states(struct net *net, struct super_block *sb)
+{
+}
+#endif
+
 /* grace period management */
 void nfsd4_end_grace(struct nfsd_net *nn);
 
diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
index 54d82e14d724..d0274d203f99 100644
--- a/fs/nfsd/trace.h
+++ b/fs/nfsd/trace.h
@@ -651,7 +651,8 @@ DEFINE_STATESEQID_EVENT(open_confirm);
 #define show_stid_status(x)						\
 	__print_flags(x, "|",						\
 		{ NFS4_STID_CLOSED,		"CLOSED" },		\
-		{ NFS4_STID_REVOKED,		"REVOKED" })		\
+		{ NFS4_STID_REVOKED,		"REVOKED" },		\
+		{ NFS4_STID_ADMIN_REVOKED,	"ADMIN_REVOKED" })
 
 DECLARE_EVENT_CLASS(nfsd_stid_class,
 	TP_PROTO(
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5/9] nfsd: allow admin-revoked state to appear in /proc/fs/nfsd/clients/*/states
  2023-11-17  2:18 [PATCH 0/9 v3] support admin-revocation of v4 state NeilBrown
                   ` (3 preceding siblings ...)
  2023-11-17  2:18 ` [PATCH 4/9] nfsd: prepare for supporting admin-revocation of state NeilBrown
@ 2023-11-17  2:18 ` NeilBrown
  2023-11-17 11:27   ` Jeff Layton
  2023-11-17  2:18 ` [PATCH 6/9] nfsd: allow admin-revoked NFSv4.0 state to be freed NeilBrown
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 31+ messages in thread
From: NeilBrown @ 2023-11-17  2:18 UTC (permalink / raw)
  To: Chuck Lever, Jeff Layton
  Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

Change the "show" functions to show some content even if a file cannot
be found.
This is primarily useful for debugging - to ensure states are being
removed eventually.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/nfsd/nfs4state.c | 81 +++++++++++++++++++++++----------------------
 1 file changed, 41 insertions(+), 40 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index a2b3320a6ba8..8debd148840f 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2680,17 +2680,10 @@ static int nfs4_show_open(struct seq_file *s, struct nfs4_stid *st)
 	struct nfs4_stateowner *oo;
 	unsigned int access, deny;
 
-	if (st->sc_type != NFS4_OPEN_STID && st->sc_type != NFS4_LOCK_STID)
-		return 0; /* XXX: or SEQ_SKIP? */
 	ols = openlockstateid(st);
 	oo = ols->st_stateowner;
 	nf = st->sc_file;
 
-	spin_lock(&nf->fi_lock);
-	file = find_any_file_locked(nf);
-	if (!file)
-		goto out;
-
 	seq_printf(s, "- ");
 	nfs4_show_stateid(s, &st->sc_stateid);
 	seq_printf(s, ": { type: open, ");
@@ -2705,14 +2698,19 @@ static int nfs4_show_open(struct seq_file *s, struct nfs4_stid *st)
 		deny & NFS4_SHARE_ACCESS_READ ? "r" : "-",
 		deny & NFS4_SHARE_ACCESS_WRITE ? "w" : "-");
 
-	nfs4_show_superblock(s, file);
-	seq_printf(s, ", ");
-	nfs4_show_fname(s, file);
-	seq_printf(s, ", ");
+	spin_lock(&nf->fi_lock);
+	file = find_any_file_locked(nf);
+	if (file) {
+		nfs4_show_superblock(s, file);
+		seq_puts(s, ", ");
+		nfs4_show_fname(s, file);
+		seq_puts(s, ", ");
+	}
+	spin_unlock(&nf->fi_lock);
 	nfs4_show_owner(s, oo);
+	if (st->sc_status & NFS4_STID_ADMIN_REVOKED)
+		seq_puts(s, ", admin-revoked");
 	seq_printf(s, " }\n");
-out:
-	spin_unlock(&nf->fi_lock);
 	return 0;
 }
 
@@ -2726,30 +2724,31 @@ static int nfs4_show_lock(struct seq_file *s, struct nfs4_stid *st)
 	ols = openlockstateid(st);
 	oo = ols->st_stateowner;
 	nf = st->sc_file;
-	spin_lock(&nf->fi_lock);
-	file = find_any_file_locked(nf);
-	if (!file)
-		goto out;
 
 	seq_printf(s, "- ");
 	nfs4_show_stateid(s, &st->sc_stateid);
 	seq_printf(s, ": { type: lock, ");
 
-	/*
-	 * Note: a lock stateid isn't really the same thing as a lock,
-	 * it's the locking state held by one owner on a file, and there
-	 * may be multiple (or no) lock ranges associated with it.
-	 * (Same for the matter is true of open stateids.)
-	 */
+	spin_lock(&nf->fi_lock);
+	file = find_any_file_locked(nf);
+	if (file) {
+		/*
+		 * Note: a lock stateid isn't really the same thing as a lock,
+		 * it's the locking state held by one owner on a file, and there
+		 * may be multiple (or no) lock ranges associated with it.
+		 * (Same for the matter is true of open stateids.)
+		 */
 
-	nfs4_show_superblock(s, file);
-	/* XXX: open stateid? */
-	seq_printf(s, ", ");
-	nfs4_show_fname(s, file);
-	seq_printf(s, ", ");
+		nfs4_show_superblock(s, file);
+		/* XXX: open stateid? */
+		seq_puts(s, ", ");
+		nfs4_show_fname(s, file);
+		seq_puts(s, ", ");
+	}
 	nfs4_show_owner(s, oo);
+	if (st->sc_status & NFS4_STID_ADMIN_REVOKED)
+		seq_puts(s, ", admin-revoked");
 	seq_printf(s, " }\n");
-out:
 	spin_unlock(&nf->fi_lock);
 	return 0;
 }
@@ -2762,27 +2761,29 @@ static int nfs4_show_deleg(struct seq_file *s, struct nfs4_stid *st)
 
 	ds = delegstateid(st);
 	nf = st->sc_file;
-	spin_lock(&nf->fi_lock);
-	file = nf->fi_deleg_file;
-	if (!file)
-		goto out;
 
 	seq_printf(s, "- ");
 	nfs4_show_stateid(s, &st->sc_stateid);
 	seq_printf(s, ": { type: deleg, ");
 
 	/* Kinda dead code as long as we only support read delegs: */
-	seq_printf(s, "access: %s, ",
-		ds->dl_type == NFS4_OPEN_DELEGATE_READ ? "r" : "w");
+	seq_printf(s, "access: %s",
+		   ds->dl_type == NFS4_OPEN_DELEGATE_READ ? "r" : "w");
 
 	/* XXX: lease time, whether it's being recalled. */
 
-	nfs4_show_superblock(s, file);
-	seq_printf(s, ", ");
-	nfs4_show_fname(s, file);
-	seq_printf(s, " }\n");
-out:
+	spin_lock(&nf->fi_lock);
+	file = nf->fi_deleg_file;
+	if (file) {
+		seq_puts(s, ", ");
+		nfs4_show_superblock(s, file);
+		seq_puts(s, ", ");
+		nfs4_show_fname(s, file);
+	}
 	spin_unlock(&nf->fi_lock);
+	if (st->sc_status & NFS4_STID_ADMIN_REVOKED)
+		seq_puts(s, ", admin-revoked");
+	seq_puts(s, " }\n");
 	return 0;
 }
 
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 6/9] nfsd: allow admin-revoked NFSv4.0 state to be freed.
  2023-11-17  2:18 [PATCH 0/9 v3] support admin-revocation of v4 state NeilBrown
                   ` (4 preceding siblings ...)
  2023-11-17  2:18 ` [PATCH 5/9] nfsd: allow admin-revoked state to appear in /proc/fs/nfsd/clients/*/states NeilBrown
@ 2023-11-17  2:18 ` NeilBrown
  2023-11-17 11:34   ` Jeff Layton
  2023-11-17 19:58   ` Chuck Lever
  2023-11-17  2:18 ` [PATCH 7/9] nfsd: allow lock state ids to be revoked and then freed NeilBrown
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 31+ messages in thread
From: NeilBrown @ 2023-11-17  2:18 UTC (permalink / raw)
  To: Chuck Lever, Jeff Layton
  Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

For NFSv4.1 and later the client easily discovers if there is any
admin-revoked state and will then find and explicitly free it.

For NFSv4.0 there is no such mechanism.  The client can only find that
state is admin-revoked if it tries to use that state, and there is no
way for it to explicitly free the state.  So the server must hold on to
the stateid (at least) for an indefinite amount of time.  A
RELEASE_LOCKOWNER request might justify forgetting some of these
stateids, as would the whole clients lease lapsing, but these are not
reliable.

This patch takes two approaches.

Whenever a client uses an revoked stateid, that stateid is then
discarded and will not be recognised again.  This might confuse a client
which expect to get NFS4ERR_ADMIN_REVOKED consistently once it get it at
all, but should mostly work.  Hopefully one error will lead to other
resources being closed (e.g.  process exits), which will result in more
stateid being freed when a CLOSE attempt gets NFS4ERR_ADMIN_REVOKED.

Also, any admin-revoked stateids that have been that way for more than
one lease time are periodically revoke.

No actual freeing of state happens in this patch.  That will come in
future patches which handle the different sorts of revoked state.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/nfsd/netns.h     |  4 ++
 fs/nfsd/nfs4state.c | 97 ++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index ec49b200b797..02f8fa095b0f 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -197,6 +197,10 @@ struct nfsd_net {
 	atomic_t		nfsd_courtesy_clients;
 	struct shrinker		nfsd_client_shrinker;
 	struct work_struct	nfsd_shrinker_work;
+
+	/* last time an admin-revoke happened for NFSv4.0 */
+	time64_t		nfs40_last_revoke;
+
 };
 
 /* Simple check to find out if a given net was properly initialized */
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 8debd148840f..8a1b8376ff08 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1724,6 +1724,14 @@ void nfsd4_revoke_states(struct net *net, struct super_block *sb)
 				}
 				nfs4_put_stid(stid);
 				spin_lock(&nn->client_lock);
+				if (clp->cl_minorversion == 0)
+					/* Allow cleanup after a lease period.
+					 * store_release ensures cleanup will
+					 * see any newly revoked states if it
+					 * sees the time updated.
+					 */
+					nn->nfs40_last_revoke =
+						ktime_get_boottime_seconds();
 				goto retry;
 			}
 		}
@@ -4650,6 +4658,39 @@ nfsd4_find_existing_open(struct nfs4_file *fp, struct nfsd4_open *open)
 	return ret;
 }
 
+static void nfsd_drop_revoked_stid(struct nfs4_stid *s)
+{
+	struct nfs4_client *cl = s->sc_client;
+
+	switch (s->sc_type) {
+	default:
+		spin_unlock(&cl->cl_lock);
+	}
+}
+
+static void nfs40_drop_revoked_stid(struct nfs4_client *cl,
+				    stateid_t *stid)
+{
+	/* NFSv4.0 has no way for the client to tell the server
+	 * that it can forget an admin-revoked stateid.
+	 * So we keep it around until the first time that the
+	 * client uses it, and drop it the first time
+	 * nfserr_admin_revoked is returned.
+	 * For v4.1 and later we wait until explicitly told
+	 * to free the stateid.
+	 */
+	if (cl->cl_minorversion == 0) {
+		struct nfs4_stid *st;
+
+		spin_lock(&cl->cl_lock);
+		st = find_stateid_locked(cl, stid);
+		if (st)
+			nfsd_drop_revoked_stid(st);
+		else
+			spin_unlock(&cl->cl_lock);
+	}
+}
+
 static __be32
 nfsd4_verify_open_stid(struct nfs4_stid *s)
 {
@@ -4672,6 +4713,10 @@ nfsd4_lock_ol_stateid(struct nfs4_ol_stateid *stp)
 
 	mutex_lock_nested(&stp->st_mutex, LOCK_STATEID_MUTEX);
 	ret = nfsd4_verify_open_stid(&stp->st_stid);
+	if (ret == nfserr_admin_revoked)
+		nfs40_drop_revoked_stid(stp->st_stid.sc_client,
+					&stp->st_stid.sc_stateid);
+
 	if (ret != nfs_ok)
 		mutex_unlock(&stp->st_mutex);
 	return ret;
@@ -5255,6 +5300,7 @@ nfs4_check_deleg(struct nfs4_client *cl, struct nfsd4_open *open,
 	}
 	if (deleg->dl_stid.sc_status & NFS4_STID_REVOKED) {
 		nfs4_put_stid(&deleg->dl_stid);
+		nfs40_drop_revoked_stid(cl, &open->op_delegate_stateid);
 		status = nfserr_deleg_revoked;
 		goto out;
 	}
@@ -6253,6 +6299,43 @@ nfs4_process_client_reaplist(struct list_head *reaplist)
 	}
 }
 
+static void nfs40_clean_admin_revoked(struct nfsd_net *nn,
+				      struct laundry_time *lt)
+{
+	struct nfs4_client *clp;
+
+	spin_lock(&nn->client_lock);
+	if (nn->nfs40_last_revoke == 0 ||
+	    nn->nfs40_last_revoke > lt->cutoff) {
+		spin_unlock(&nn->client_lock);
+		return;
+	}
+	nn->nfs40_last_revoke = 0;
+
+retry:
+	list_for_each_entry(clp, &nn->client_lru, cl_lru) {
+		unsigned long id, tmp;
+		struct nfs4_stid *stid;
+
+		if (atomic_read(&clp->cl_admin_revoked) == 0)
+			continue;
+
+		spin_lock(&clp->cl_lock);
+		idr_for_each_entry_ul(&clp->cl_stateids, stid, tmp, id)
+			if (stid->sc_status & NFS4_STID_ADMIN_REVOKED) {
+				refcount_inc(&stid->sc_count);
+				spin_unlock(&nn->client_lock);
+				/* this function drops ->cl_lock */
+				nfsd_drop_revoked_stid(stid);
+				nfs4_put_stid(stid);
+				spin_lock(&nn->client_lock);
+				goto retry;
+			}
+		spin_unlock(&clp->cl_lock);
+	}
+	spin_unlock(&nn->client_lock);
+}
+
 static time64_t
 nfs4_laundromat(struct nfsd_net *nn)
 {
@@ -6286,6 +6369,8 @@ nfs4_laundromat(struct nfsd_net *nn)
 	nfs4_get_client_reaplist(nn, &reaplist, &lt);
 	nfs4_process_client_reaplist(&reaplist);
 
+	nfs40_clean_admin_revoked(nn, &lt);
+
 	spin_lock(&state_lock);
 	list_for_each_safe(pos, next, &nn->del_recall_lru) {
 		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
@@ -6504,6 +6589,9 @@ static __be32 nfsd4_stid_check_stateid_generation(stateid_t *in, struct nfs4_sti
 	if (ret == nfs_ok)
 		ret = check_stateid_generation(in, &s->sc_stateid, has_session);
 	spin_unlock(&s->sc_lock);
+	if (ret == nfserr_admin_revoked)
+		nfs40_drop_revoked_stid(s->sc_client,
+					&s->sc_stateid);
 	return ret;
 }
 
@@ -6548,6 +6636,8 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
 	}
 out_unlock:
 	spin_unlock(&cl->cl_lock);
+	if (status == nfserr_admin_revoked)
+		nfs40_drop_revoked_stid(cl, stateid);
 	return status;
 }
 
@@ -6594,6 +6684,7 @@ nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
 		return nfserr_deleg_revoked;
 	}
 	if (stid->sc_type & NFS4_STID_ADMIN_REVOKED) {
+		nfs40_drop_revoked_stid(cstate->clp, stateid);
 		nfs4_put_stid(stid);
 		return nfserr_admin_revoked;
 	}
@@ -6886,6 +6977,11 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	s = find_stateid_locked(cl, stateid);
 	if (!s || s->sc_status & NFS4_STID_CLOSED)
 		goto out_unlock;
+	if (s->sc_status & NFS4_STID_ADMIN_REVOKED) {
+		nfsd_drop_revoked_stid(s);
+		ret = nfs_ok;
+		goto out;
+	}
 	spin_lock(&s->sc_lock);
 	switch (s->sc_type) {
 	case NFS4_DELEG_STID:
@@ -6912,7 +7008,6 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		spin_unlock(&cl->cl_lock);
 		ret = nfsd4_free_lock_stateid(stateid, s);
 		goto out;
-	/* Default falls through and returns nfserr_bad_stateid */
 	}
 	spin_unlock(&s->sc_lock);
 out_unlock:
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 7/9] nfsd: allow lock state ids to be revoked and then freed
  2023-11-17  2:18 [PATCH 0/9 v3] support admin-revocation of v4 state NeilBrown
                   ` (5 preceding siblings ...)
  2023-11-17  2:18 ` [PATCH 6/9] nfsd: allow admin-revoked NFSv4.0 state to be freed NeilBrown
@ 2023-11-17  2:18 ` NeilBrown
  2023-11-17  2:18 ` [PATCH 8/9] nfsd: allow open " NeilBrown
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 31+ messages in thread
From: NeilBrown @ 2023-11-17  2:18 UTC (permalink / raw)
  To: Chuck Lever, Jeff Layton
  Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

Revoking state through 'unlock_filesystem' now revokes any lock states
found.  When the stateids are then freed by the client, the revoked
stateids will be cleaned up correctly.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/nfsd/nfs4state.c | 40 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 8a1b8376ff08..9fec3bf69d31 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1708,7 +1708,7 @@ void nfsd4_revoke_states(struct net *net, struct super_block *sb)
 	unsigned int idhashval;
 	unsigned int sc_types;
 
-	sc_types = 0;
+	sc_types = NFS4_LOCK_STID;
 
 	spin_lock(&nn->client_lock);
 	for (idhashval = 0; idhashval < CLIENT_HASH_MASK; idhashval++) {
@@ -1719,8 +1719,36 @@ void nfsd4_revoke_states(struct net *net, struct super_block *sb)
 			struct nfs4_stid *stid = find_one_sb_stid(clp, sb,
 								  sc_types);
 			if (stid) {
+				struct nfs4_ol_stateid *stp;
+
 				spin_unlock(&nn->client_lock);
 				switch (stid->sc_type) {
+				case NFS4_LOCK_STID:
+					stp = openlockstateid(stid);
+					mutex_lock_nested(&stp->st_mutex,
+							  LOCK_STATEID_MUTEX);
+					spin_lock(&clp->cl_lock);
+					if (stid->sc_status == 0) {
+						struct nfs4_lockowner *lo =
+							lockowner(stp->st_stateowner);
+						struct nfsd_file *nf;
+
+						stid->sc_status |=
+							NFS4_STID_ADMIN_REVOKED;
+						atomic_inc(&clp->cl_admin_revoked);
+						spin_unlock(&clp->cl_lock);
+						nf = find_any_file(stp->st_stid.sc_file);
+						if (nf) {
+							get_file(nf->nf_file);
+							filp_close(nf->nf_file,
+								   (fl_owner_t)lo);
+							nfsd_file_put(nf);
+						}
+						release_all_access(stp);
+					} else
+						spin_unlock(&clp->cl_lock);
+					mutex_unlock(&stp->st_mutex);
+					break;
 				}
 				nfs4_put_stid(stid);
 				spin_lock(&nn->client_lock);
@@ -4661,8 +4689,18 @@ nfsd4_find_existing_open(struct nfs4_file *fp, struct nfsd4_open *open)
 static void nfsd_drop_revoked_stid(struct nfs4_stid *s)
 {
 	struct nfs4_client *cl = s->sc_client;
+	LIST_HEAD(reaplist);
+	struct nfs4_ol_stateid *stp;
+	bool unhashed;
 
 	switch (s->sc_type) {
+	case NFS4_LOCK_STID:
+		stp = openlockstateid(s);
+		unhashed = unhash_lock_stateid(stp);
+		spin_unlock(&cl->cl_lock);
+		if (unhashed)
+			nfs4_put_stid(s);
+		break;
 	default:
 		spin_unlock(&cl->cl_lock);
 	}
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 8/9] nfsd: allow open state ids to be revoked and then freed
  2023-11-17  2:18 [PATCH 0/9 v3] support admin-revocation of v4 state NeilBrown
                   ` (6 preceding siblings ...)
  2023-11-17  2:18 ` [PATCH 7/9] nfsd: allow lock state ids to be revoked and then freed NeilBrown
@ 2023-11-17  2:18 ` NeilBrown
  2023-11-17  2:18 ` [PATCH 9/9] nfsd: allow delegation " NeilBrown
  2023-11-17 12:25 ` [PATCH 0/9 v3] support admin-revocation of v4 state Jeff Layton
  9 siblings, 0 replies; 31+ messages in thread
From: NeilBrown @ 2023-11-17  2:18 UTC (permalink / raw)
  To: Chuck Lever, Jeff Layton
  Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

Revoking state through 'unlock_filesystem' now revokes any open states
found.  When the stateids are then freed by the client, the revoked
stateids will be cleaned up correctly.

Possibly the related lock states should be revoked too, but a
subsequent patch will do that for all lock state on the superblock.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/nfsd/nfs4state.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 9fec3bf69d31..dd454f50383e 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1708,7 +1708,7 @@ void nfsd4_revoke_states(struct net *net, struct super_block *sb)
 	unsigned int idhashval;
 	unsigned int sc_types;
 
-	sc_types = NFS4_LOCK_STID;
+	sc_types = NFS4_OPEN_STID | NFS4_LOCK_STID;
 
 	spin_lock(&nn->client_lock);
 	for (idhashval = 0; idhashval < CLIENT_HASH_MASK; idhashval++) {
@@ -1723,6 +1723,22 @@ void nfsd4_revoke_states(struct net *net, struct super_block *sb)
 
 				spin_unlock(&nn->client_lock);
 				switch (stid->sc_type) {
+				case NFS4_OPEN_STID:
+					stp = openlockstateid(stid);
+					mutex_lock_nested(&stp->st_mutex,
+							  OPEN_STATEID_MUTEX);
+
+					spin_lock(&clp->cl_lock);
+					if (stid->sc_status == 0) {
+						stid->sc_status |=
+							NFS4_STID_ADMIN_REVOKED;
+						atomic_inc(&clp->cl_admin_revoked);
+						spin_unlock(&clp->cl_lock);
+						release_all_access(stp);
+					} else
+						spin_unlock(&clp->cl_lock);
+					mutex_unlock(&stp->st_mutex);
+					break;
 				case NFS4_LOCK_STID:
 					stp = openlockstateid(stid);
 					mutex_lock_nested(&stp->st_mutex,
@@ -4694,6 +4710,13 @@ static void nfsd_drop_revoked_stid(struct nfs4_stid *s)
 	bool unhashed;
 
 	switch (s->sc_type) {
+	case NFS4_OPEN_STID:
+		stp = openlockstateid(s);
+		if (unhash_open_stateid(stp, &reaplist))
+			put_ol_stateid_locked(stp, &reaplist);
+		spin_unlock(&cl->cl_lock);
+		free_ol_stateid_reaplist(&reaplist);
+		break;
 	case NFS4_LOCK_STID:
 		stp = openlockstateid(s);
 		unhashed = unhash_lock_stateid(stp);
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 9/9] nfsd: allow delegation state ids to be revoked and then freed
  2023-11-17  2:18 [PATCH 0/9 v3] support admin-revocation of v4 state NeilBrown
                   ` (7 preceding siblings ...)
  2023-11-17  2:18 ` [PATCH 8/9] nfsd: allow open " NeilBrown
@ 2023-11-17  2:18 ` NeilBrown
  2023-11-17 12:25 ` [PATCH 0/9 v3] support admin-revocation of v4 state Jeff Layton
  9 siblings, 0 replies; 31+ messages in thread
From: NeilBrown @ 2023-11-17  2:18 UTC (permalink / raw)
  To: Chuck Lever, Jeff Layton
  Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

Revoking state through 'unlock_filesystem' now revokes any delegation
states found.  When the stateids are then freed by the client, the
revoked stateids will be cleaned up correctly.

As there is already support for revoking delegations, we build on that
for admin-revoking.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/nfsd/nfs4state.c | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index dd454f50383e..feceaf4bf638 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1340,9 +1340,12 @@ unhash_delegation_locked(struct nfs4_delegation *dp, unsigned int statusmask)
 	if (!delegation_hashed(dp))
 		return false;
 
-	if (dp->dl_stid.sc_client->cl_minorversion == 0)
+	if (statusmask == NFS4_STID_REVOKED &&
+	    dp->dl_stid.sc_client->cl_minorversion == 0)
 		statusmask = NFS4_STID_CLOSED;
 	dp->dl_stid.sc_status |= statusmask;
+	if (statusmask & NFS4_STID_ADMIN_REVOKED)
+		atomic_inc(&dp->dl_stid.sc_client->cl_admin_revoked);
 
 	/* Ensure that deleg break won't try to requeue it */
 	++dp->dl_time;
@@ -1373,7 +1376,8 @@ static void revoke_delegation(struct nfs4_delegation *dp)
 
 	trace_nfsd_stid_revoke(&dp->dl_stid);
 
-	if (dp->dl_stid.sc_status & NFS4_STID_REVOKED) {
+	if (dp->dl_stid.sc_status &
+	    (NFS4_STID_REVOKED | NFS4_STID_ADMIN_REVOKED)) {
 		spin_lock(&clp->cl_lock);
 		refcount_inc(&dp->dl_stid.sc_count);
 		list_add(&dp->dl_recall_lru, &clp->cl_revoked);
@@ -1708,7 +1712,7 @@ void nfsd4_revoke_states(struct net *net, struct super_block *sb)
 	unsigned int idhashval;
 	unsigned int sc_types;
 
-	sc_types = NFS4_OPEN_STID | NFS4_LOCK_STID;
+	sc_types = NFS4_OPEN_STID | NFS4_LOCK_STID | NFS4_DELEG_STID;
 
 	spin_lock(&nn->client_lock);
 	for (idhashval = 0; idhashval < CLIENT_HASH_MASK; idhashval++) {
@@ -1720,6 +1724,7 @@ void nfsd4_revoke_states(struct net *net, struct super_block *sb)
 								  sc_types);
 			if (stid) {
 				struct nfs4_ol_stateid *stp;
+				struct nfs4_delegation *dp;
 
 				spin_unlock(&nn->client_lock);
 				switch (stid->sc_type) {
@@ -1765,6 +1770,18 @@ void nfsd4_revoke_states(struct net *net, struct super_block *sb)
 						spin_unlock(&clp->cl_lock);
 					mutex_unlock(&stp->st_mutex);
 					break;
+				case NFS4_DELEG_STID:
+					dp = delegstateid(stid);
+					spin_lock(&state_lock);
+					if (!unhash_delegation_locked(
+						    dp, NFS4_STID_ADMIN_REVOKED))
+						dp = NULL;
+					spin_unlock(&state_lock);
+					if (dp) {
+						list_del_init(&dp->dl_recall_lru);
+						revoke_delegation(dp);
+					}
+					break;
 				}
 				nfs4_put_stid(stid);
 				spin_lock(&nn->client_lock);
@@ -4707,6 +4724,7 @@ static void nfsd_drop_revoked_stid(struct nfs4_stid *s)
 	struct nfs4_client *cl = s->sc_client;
 	LIST_HEAD(reaplist);
 	struct nfs4_ol_stateid *stp;
+	struct nfs4_delegation *dp;
 	bool unhashed;
 
 	switch (s->sc_type) {
@@ -4724,6 +4742,12 @@ static void nfsd_drop_revoked_stid(struct nfs4_stid *s)
 		if (unhashed)
 			nfs4_put_stid(s);
 		break;
+	case NFS4_DELEG_STID:
+		dp = delegstateid(s);
+		list_del_init(&dp->dl_recall_lru);
+		spin_unlock(&cl->cl_lock);
+		nfs4_put_stid(s);
+		break;
 	default:
 		spin_unlock(&cl->cl_lock);
 	}
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/9] nfsd: hold ->cl_lock for hash_delegation_locked()
  2023-11-17  2:18 ` [PATCH 1/9] nfsd: hold ->cl_lock for hash_delegation_locked() NeilBrown
@ 2023-11-17 10:59   ` Jeff Layton
  2023-11-19 21:37     ` NeilBrown
  2023-11-17 15:04   ` Chuck Lever
  1 sibling, 1 reply; 31+ messages in thread
From: Jeff Layton @ 2023-11-17 10:59 UTC (permalink / raw)
  To: NeilBrown, Chuck Lever; +Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Fri, 2023-11-17 at 13:18 +1100, NeilBrown wrote:
> The protocol for creating a new state in nfsd is to allocated the state
> leaving it largely uninitialised, add that state to the ->cl_stateids
> idr so as to reserve a state id, then complete initialisation of the
> state and only set ->sc_type to non-zero once the state is fully
> initialised.
> 
> If a state is found in the idr with ->sc_type == 0, it is ignored.
> The ->cl_lock list is used to avoid races - it is held while checking
> sc_type during lookup, and held when a non-zero value is stored in
> ->sc_type.
> 
> ... except... hash_delegation_locked() finalises the initialisation of a
> delegation state, but does NOT hold ->cl_lock.
> 
> So this patch takes ->cl_lock at the appropriate time w.r.t other locks,
> and so ensures there are no races (which are extremely unlikely in any
> case).
> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  fs/nfsd/nfs4state.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 65fd5510323a..6368788a7d4e 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1317,6 +1317,7 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
>  
>  	lockdep_assert_held(&state_lock);
>  	lockdep_assert_held(&fp->fi_lock);
> +	lockdep_assert_held(&clp->cl_lock);
>  
>  	if (nfs4_delegation_exists(clp, fp))
>  		return -EAGAIN;
> @@ -5609,12 +5610,14 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
>  		goto out_unlock;
>  
>  	spin_lock(&state_lock);
> +	spin_lock(&clp->cl_lock);
>  	spin_lock(&fp->fi_lock);
>  	if (fp->fi_had_conflict)
>  		status = -EAGAIN;
>  	else
>  		status = hash_delegation_locked(dp, fp);
>  	spin_unlock(&fp->fi_lock);
> +	spin_unlock(&clp->cl_lock);
>  	spin_unlock(&state_lock);
>  
>  	if (status)

I know it's (supposedly) an unlikely race, but should we send this to
stable?

Reviewed-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/9] nfsd: allow admin-revoked state to appear in /proc/fs/nfsd/clients/*/states
  2023-11-17  2:18 ` [PATCH 5/9] nfsd: allow admin-revoked state to appear in /proc/fs/nfsd/clients/*/states NeilBrown
@ 2023-11-17 11:27   ` Jeff Layton
  2023-11-19 22:36     ` NeilBrown
  0 siblings, 1 reply; 31+ messages in thread
From: Jeff Layton @ 2023-11-17 11:27 UTC (permalink / raw)
  To: NeilBrown, Chuck Lever; +Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Fri, 2023-11-17 at 13:18 +1100, NeilBrown wrote:
> Change the "show" functions to show some content even if a file cannot
> be found.
> This is primarily useful for debugging - to ensure states are being
> removed eventually.
> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  fs/nfsd/nfs4state.c | 81 +++++++++++++++++++++++----------------------
>  1 file changed, 41 insertions(+), 40 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index a2b3320a6ba8..8debd148840f 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -2680,17 +2680,10 @@ static int nfs4_show_open(struct seq_file *s, struct nfs4_stid *st)
>  	struct nfs4_stateowner *oo;
>  	unsigned int access, deny;
>  
> -	if (st->sc_type != NFS4_OPEN_STID && st->sc_type != NFS4_LOCK_STID)
> -		return 0; /* XXX: or SEQ_SKIP? */
>  	ols = openlockstateid(st);
>  	oo = ols->st_stateowner;
>  	nf = st->sc_file;
>  
> -	spin_lock(&nf->fi_lock);
> -	file = find_any_file_locked(nf);
> -	if (!file)
> -		goto out;
> -
>  	seq_printf(s, "- ");
>  	nfs4_show_stateid(s, &st->sc_stateid);
>  	seq_printf(s, ": { type: open, ");
> @@ -2705,14 +2698,19 @@ static int nfs4_show_open(struct seq_file *s, struct nfs4_stid *st)
>  		deny & NFS4_SHARE_ACCESS_READ ? "r" : "-",
>  		deny & NFS4_SHARE_ACCESS_WRITE ? "w" : "-");
>  
> -	nfs4_show_superblock(s, file);
> -	seq_printf(s, ", ");
> -	nfs4_show_fname(s, file);
> -	seq_printf(s, ", ");
> +	spin_lock(&nf->fi_lock);
> +	file = find_any_file_locked(nf);
> +	if (file) {
> +		nfs4_show_superblock(s, file);
> +		seq_puts(s, ", ");
> +		nfs4_show_fname(s, file);
> +		seq_puts(s, ", ");
> +	}
> +	spin_unlock(&nf->fi_lock);
>  	nfs4_show_owner(s, oo);
> +	if (st->sc_status & NFS4_STID_ADMIN_REVOKED)
> +		seq_puts(s, ", admin-revoked");
>  	seq_printf(s, " }\n");
> -out:
> -	spin_unlock(&nf->fi_lock);
>  	return 0;
>  }
>  
> @@ -2726,30 +2724,31 @@ static int nfs4_show_lock(struct seq_file *s, struct nfs4_stid *st)
>  	ols = openlockstateid(st);
>  	oo = ols->st_stateowner;
>  	nf = st->sc_file;
> -	spin_lock(&nf->fi_lock);
> -	file = find_any_file_locked(nf);
> -	if (!file)
> -		goto out;
>  
>  	seq_printf(s, "- ");
>  	nfs4_show_stateid(s, &st->sc_stateid);
>  	seq_printf(s, ": { type: lock, ");
>  
> -	/*
> -	 * Note: a lock stateid isn't really the same thing as a lock,
> -	 * it's the locking state held by one owner on a file, and there
> -	 * may be multiple (or no) lock ranges associated with it.
> -	 * (Same for the matter is true of open stateids.)
> -	 */
> +	spin_lock(&nf->fi_lock);
> +	file = find_any_file_locked(nf);
> +	if (file) {
> +		/*
> +		 * Note: a lock stateid isn't really the same thing as a lock,
> +		 * it's the locking state held by one owner on a file, and there
> +		 * may be multiple (or no) lock ranges associated with it.
> +		 * (Same for the matter is true of open stateids.)
> +		 */
>  
> -	nfs4_show_superblock(s, file);
> -	/* XXX: open stateid? */
> -	seq_printf(s, ", ");
> -	nfs4_show_fname(s, file);
> -	seq_printf(s, ", ");
> +		nfs4_show_superblock(s, file);
> +		/* XXX: open stateid? */
> +		seq_puts(s, ", ");
> +		nfs4_show_fname(s, file);
> +		seq_puts(s, ", ");
> +	}
>  	nfs4_show_owner(s, oo);
> +	if (st->sc_status & NFS4_STID_ADMIN_REVOKED)
> +		seq_puts(s, ", admin-revoked");
>  	seq_printf(s, " }\n");
> -out:
>  	spin_unlock(&nf->fi_lock);
>  	return 0;
>  }
> @@ -2762,27 +2761,29 @@ static int nfs4_show_deleg(struct seq_file *s, struct nfs4_stid *st)
>  
>  	ds = delegstateid(st);
>  	nf = st->sc_file;
> -	spin_lock(&nf->fi_lock);
> -	file = nf->fi_deleg_file;
> -	if (!file)
> -		goto out;
>  
>  	seq_printf(s, "- ");
>  	nfs4_show_stateid(s, &st->sc_stateid);
>  	seq_printf(s, ": { type: deleg, ");
>  
>  	/* Kinda dead code as long as we only support read delegs: */

nit: You can probably remove the above comment since we now support
write delegations.

> -	seq_printf(s, "access: %s, ",
> -		ds->dl_type == NFS4_OPEN_DELEGATE_READ ? "r" : "w");
> +	seq_printf(s, "access: %s",
> +		   ds->dl_type == NFS4_OPEN_DELEGATE_READ ? "r" : "w");
>  
>  	/* XXX: lease time, whether it's being recalled. */
>  
> -	nfs4_show_superblock(s, file);
> -	seq_printf(s, ", ");
> -	nfs4_show_fname(s, file);
> -	seq_printf(s, " }\n");
> -out:
> +	spin_lock(&nf->fi_lock);
> +	file = nf->fi_deleg_file;
> +	if (file) {
> +		seq_puts(s, ", ");
> +		nfs4_show_superblock(s, file);
> +		seq_puts(s, ", ");
> +		nfs4_show_fname(s, file);
> +	}
>  	spin_unlock(&nf->fi_lock);
> +	if (st->sc_status & NFS4_STID_ADMIN_REVOKED)
> +		seq_puts(s, ", admin-revoked");
> +	seq_puts(s, " }\n");
>  	return 0;
>  }
>  

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/9] nfsd: allow admin-revoked NFSv4.0 state to be freed.
  2023-11-17  2:18 ` [PATCH 6/9] nfsd: allow admin-revoked NFSv4.0 state to be freed NeilBrown
@ 2023-11-17 11:34   ` Jeff Layton
  2023-11-19 22:40     ` NeilBrown
  2023-11-17 19:58   ` Chuck Lever
  1 sibling, 1 reply; 31+ messages in thread
From: Jeff Layton @ 2023-11-17 11:34 UTC (permalink / raw)
  To: NeilBrown, Chuck Lever; +Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Fri, 2023-11-17 at 13:18 +1100, NeilBrown wrote:
> For NFSv4.1 and later the client easily discovers if there is any
> admin-revoked state and will then find and explicitly free it.
> 
> For NFSv4.0 there is no such mechanism.  The client can only find that
> state is admin-revoked if it tries to use that state, and there is no
> way for it to explicitly free the state.  So the server must hold on to
> the stateid (at least) for an indefinite amount of time.  A
> RELEASE_LOCKOWNER request might justify forgetting some of these
> stateids, as would the whole clients lease lapsing, but these are not
> reliable.
> 
> This patch takes two approaches.
> 
> Whenever a client uses an revoked stateid, that stateid is then
> discarded and will not be recognised again.  This might confuse a client
> which expect to get NFS4ERR_ADMIN_REVOKED consistently once it get it at
> all, but should mostly work.  Hopefully one error will lead to other
> resources being closed (e.g.  process exits), which will result in more
> stateid being freed when a CLOSE attempt gets NFS4ERR_ADMIN_REVOKED.
> 
> Also, any admin-revoked stateids that have been that way for more than
> one lease time are periodically revoke.
> 

Why a single lease period?

Is that a long enough time for v4.0? Given that the protocol has no
mechanism to detect revoked state, I have to wonder if a single lease
period is enough time for the client to detect the problem?

> No actual freeing of state happens in this patch.  That will come in
> future patches which handle the different sorts of revoked state.
> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  fs/nfsd/netns.h     |  4 ++
>  fs/nfsd/nfs4state.c | 97 ++++++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 100 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> index ec49b200b797..02f8fa095b0f 100644
> --- a/fs/nfsd/netns.h
> +++ b/fs/nfsd/netns.h
> @@ -197,6 +197,10 @@ struct nfsd_net {
>  	atomic_t		nfsd_courtesy_clients;
>  	struct shrinker		nfsd_client_shrinker;
>  	struct work_struct	nfsd_shrinker_work;
> +
> +	/* last time an admin-revoke happened for NFSv4.0 */
> +	time64_t		nfs40_last_revoke;
> +
>  };
>  
>  /* Simple check to find out if a given net was properly initialized */
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 8debd148840f..8a1b8376ff08 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1724,6 +1724,14 @@ void nfsd4_revoke_states(struct net *net, struct super_block *sb)
>  				}
>  				nfs4_put_stid(stid);
>  				spin_lock(&nn->client_lock);
> +				if (clp->cl_minorversion == 0)
> +					/* Allow cleanup after a lease period.
> +					 * store_release ensures cleanup will
> +					 * see any newly revoked states if it
> +					 * sees the time updated.
> +					 */
> +					nn->nfs40_last_revoke =
> +						ktime_get_boottime_seconds();
>  				goto retry;
>  			}
>  		}
> @@ -4650,6 +4658,39 @@ nfsd4_find_existing_open(struct nfs4_file *fp, struct nfsd4_open *open)
>  	return ret;
>  }
>  
> +static void nfsd_drop_revoked_stid(struct nfs4_stid *s)
> +{
> +	struct nfs4_client *cl = s->sc_client;
> +
> +	switch (s->sc_type) {
> +	default:
> +		spin_unlock(&cl->cl_lock);
> +	}
> +}
> +
> +static void nfs40_drop_revoked_stid(struct nfs4_client *cl,
> +				    stateid_t *stid)
> +{
> +	/* NFSv4.0 has no way for the client to tell the server
> +	 * that it can forget an admin-revoked stateid.
> +	 * So we keep it around until the first time that the
> +	 * client uses it, and drop it the first time
> +	 * nfserr_admin_revoked is returned.
> +	 * For v4.1 and later we wait until explicitly told
> +	 * to free the stateid.
> +	 */
> +	if (cl->cl_minorversion == 0) {
> +		struct nfs4_stid *st;
> +
> +		spin_lock(&cl->cl_lock);
> +		st = find_stateid_locked(cl, stid);
> +		if (st)
> +			nfsd_drop_revoked_stid(st);
> +		else
> +			spin_unlock(&cl->cl_lock);
> +	}
> +}
> +
>  static __be32
>  nfsd4_verify_open_stid(struct nfs4_stid *s)
>  {
> @@ -4672,6 +4713,10 @@ nfsd4_lock_ol_stateid(struct nfs4_ol_stateid *stp)
>  
>  	mutex_lock_nested(&stp->st_mutex, LOCK_STATEID_MUTEX);
>  	ret = nfsd4_verify_open_stid(&stp->st_stid);
> +	if (ret == nfserr_admin_revoked)
> +		nfs40_drop_revoked_stid(stp->st_stid.sc_client,
> +					&stp->st_stid.sc_stateid);
> +
>  	if (ret != nfs_ok)
>  		mutex_unlock(&stp->st_mutex);
>  	return ret;
> @@ -5255,6 +5300,7 @@ nfs4_check_deleg(struct nfs4_client *cl, struct nfsd4_open *open,
>  	}
>  	if (deleg->dl_stid.sc_status & NFS4_STID_REVOKED) {
>  		nfs4_put_stid(&deleg->dl_stid);
> +		nfs40_drop_revoked_stid(cl, &open->op_delegate_stateid);
>  		status = nfserr_deleg_revoked;
>  		goto out;
>  	}
> @@ -6253,6 +6299,43 @@ nfs4_process_client_reaplist(struct list_head *reaplist)
>  	}
>  }
>  
> +static void nfs40_clean_admin_revoked(struct nfsd_net *nn,
> +				      struct laundry_time *lt)
> +{
> +	struct nfs4_client *clp;
> +
> +	spin_lock(&nn->client_lock);
> +	if (nn->nfs40_last_revoke == 0 ||
> +	    nn->nfs40_last_revoke > lt->cutoff) {
> +		spin_unlock(&nn->client_lock);
> +		return;
> +	}
> +	nn->nfs40_last_revoke = 0;
> +
> +retry:
> +	list_for_each_entry(clp, &nn->client_lru, cl_lru) {
> +		unsigned long id, tmp;
> +		struct nfs4_stid *stid;
> +
> +		if (atomic_read(&clp->cl_admin_revoked) == 0)
> +			continue;
> +
> +		spin_lock(&clp->cl_lock);
> +		idr_for_each_entry_ul(&clp->cl_stateids, stid, tmp, id)
> +			if (stid->sc_status & NFS4_STID_ADMIN_REVOKED) {
> +				refcount_inc(&stid->sc_count);
> +				spin_unlock(&nn->client_lock);
> +				/* this function drops ->cl_lock */
> +				nfsd_drop_revoked_stid(stid);
> +				nfs4_put_stid(stid);
> +				spin_lock(&nn->client_lock);
> +				goto retry;
> +			}
> +		spin_unlock(&clp->cl_lock);
> +	}
> +	spin_unlock(&nn->client_lock);
> +}
> +
>  static time64_t
>  nfs4_laundromat(struct nfsd_net *nn)
>  {
> @@ -6286,6 +6369,8 @@ nfs4_laundromat(struct nfsd_net *nn)
>  	nfs4_get_client_reaplist(nn, &reaplist, &lt);
>  	nfs4_process_client_reaplist(&reaplist);
>  
> +	nfs40_clean_admin_revoked(nn, &lt);
> +
>  	spin_lock(&state_lock);
>  	list_for_each_safe(pos, next, &nn->del_recall_lru) {
>  		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
> @@ -6504,6 +6589,9 @@ static __be32 nfsd4_stid_check_stateid_generation(stateid_t *in, struct nfs4_sti
>  	if (ret == nfs_ok)
>  		ret = check_stateid_generation(in, &s->sc_stateid, has_session);
>  	spin_unlock(&s->sc_lock);
> +	if (ret == nfserr_admin_revoked)
> +		nfs40_drop_revoked_stid(s->sc_client,
> +					&s->sc_stateid);
>  	return ret;
>  }
>  
> @@ -6548,6 +6636,8 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
>  	}
>  out_unlock:
>  	spin_unlock(&cl->cl_lock);
> +	if (status == nfserr_admin_revoked)
> +		nfs40_drop_revoked_stid(cl, stateid);
>  	return status;
>  }
>  
> @@ -6594,6 +6684,7 @@ nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
>  		return nfserr_deleg_revoked;
>  	}
>  	if (stid->sc_type & NFS4_STID_ADMIN_REVOKED) {
> +		nfs40_drop_revoked_stid(cstate->clp, stateid);
>  		nfs4_put_stid(stid);
>  		return nfserr_admin_revoked;
>  	}
> @@ -6886,6 +6977,11 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  	s = find_stateid_locked(cl, stateid);
>  	if (!s || s->sc_status & NFS4_STID_CLOSED)
>  		goto out_unlock;
> +	if (s->sc_status & NFS4_STID_ADMIN_REVOKED) {
> +		nfsd_drop_revoked_stid(s);
> +		ret = nfs_ok;
> +		goto out;
> +	}
>  	spin_lock(&s->sc_lock);
>  	switch (s->sc_type) {
>  	case NFS4_DELEG_STID:
> @@ -6912,7 +7008,6 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  		spin_unlock(&cl->cl_lock);
>  		ret = nfsd4_free_lock_stateid(stateid, s);
>  		goto out;
> -	/* Default falls through and returns nfserr_bad_stateid */
>  	}
>  	spin_unlock(&s->sc_lock);
>  out_unlock:

-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/9] nfsd: avoid race after unhash_delegation_locked()
  2023-11-17  2:18 ` [PATCH 2/9] nfsd: avoid race after unhash_delegation_locked() NeilBrown
@ 2023-11-17 11:41   ` Jeff Layton
  2023-11-17 19:43     ` Chuck Lever
  0 siblings, 1 reply; 31+ messages in thread
From: Jeff Layton @ 2023-11-17 11:41 UTC (permalink / raw)
  To: NeilBrown, Chuck Lever; +Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Fri, 2023-11-17 at 13:18 +1100, NeilBrown wrote:
> NFS4_CLOSED_DELEG_STID and NFS4_REVOKED_DELEG_STID are similar in
> purpose.
> REVOKED is used for NFSv4.1 states which have been revoked because the
> lease has expired.  CLOSED is used in other cases.
> The difference has two practical effects.
> 1/ REVOKED states are on the ->cl_revoked list
> 2/ REVOKED states result in nfserr_deleg_revoked from
>    nfsd4_verify_open_stid() asnd nfsd4_validate_stateid while
>    CLOSED states result in nfserr_bad_stid.
> 
> Currently a state that is being revoked is first set to "CLOSED" in
> unhash_delegation_locked(), then possibly to "REVOKED" in
> revoke_delegation(), at which point it is added to the cl_revoked list.
> 
> It is possible that a stateid test could see the CLOSED state
> which really should be REVOKED, and so return the wrong error code.  So
> it is safest to remove this window of inconsistency.
> 
> With this patch, unhash_delegation_locked() always set the state
> correctly, and revoke_delegation() no longer changes the state.
> 
> Also remove a redundant test on minorversion when
> NFS4_REVOKED_DELEG_STID is seen - it can only be seen when minorversion
> is non-zero.
> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  fs/nfsd/nfs4state.c | 20 ++++++++++----------
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 6368788a7d4e..7469583382fb 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1334,7 +1334,7 @@ static bool delegation_hashed(struct nfs4_delegation *dp)
>  }
>  
>  static bool
> -unhash_delegation_locked(struct nfs4_delegation *dp)
> +unhash_delegation_locked(struct nfs4_delegation *dp, unsigned char type)
>  {
>  	struct nfs4_file *fp = dp->dl_stid.sc_file;
>  
> @@ -1343,7 +1343,9 @@ unhash_delegation_locked(struct nfs4_delegation *dp)
>  	if (!delegation_hashed(dp))
>  		return false;
>  
> -	dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
> +	if (dp->dl_stid.sc_client->cl_minorversion == 0)
> +		type = NFS4_CLOSED_DELEG_STID;
> +	dp->dl_stid.sc_type = type;
>  	/* Ensure that deleg break won't try to requeue it */
>  	++dp->dl_time;
>  	spin_lock(&fp->fi_lock);
> @@ -1359,7 +1361,7 @@ static void destroy_delegation(struct nfs4_delegation *dp)
>  	bool unhashed;
>  
>  	spin_lock(&state_lock);
> -	unhashed = unhash_delegation_locked(dp);
> +	unhashed = unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID);
>  	spin_unlock(&state_lock);
>  	if (unhashed)
>  		destroy_unhashed_deleg(dp);
> @@ -1373,9 +1375,8 @@ static void revoke_delegation(struct nfs4_delegation *dp)
>  
>  	trace_nfsd_stid_revoke(&dp->dl_stid);
>  
> -	if (clp->cl_minorversion) {
> +	if (dp->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID) {
>  		spin_lock(&clp->cl_lock);
> -		dp->dl_stid.sc_type = NFS4_REVOKED_DELEG_STID;
>  		refcount_inc(&dp->dl_stid.sc_count);
>  		list_add(&dp->dl_recall_lru, &clp->cl_revoked);
>  		spin_unlock(&clp->cl_lock);
> @@ -2234,7 +2235,7 @@ __destroy_client(struct nfs4_client *clp)
>  	spin_lock(&state_lock);
>  	while (!list_empty(&clp->cl_delegations)) {
>  		dp = list_entry(clp->cl_delegations.next, struct nfs4_delegation, dl_perclnt);
> -		WARN_ON(!unhash_delegation_locked(dp));
> +		WARN_ON(!unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID));
>  		list_add(&dp->dl_recall_lru, &reaplist);
>  	}
>  	spin_unlock(&state_lock);
> @@ -5197,8 +5198,7 @@ nfs4_check_deleg(struct nfs4_client *cl, struct nfsd4_open *open,
>  		goto out;
>  	if (deleg->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID) {
>  		nfs4_put_stid(&deleg->dl_stid);
> -		if (cl->cl_minorversion)
> -			status = nfserr_deleg_revoked;
> +		status = nfserr_deleg_revoked;
>  		goto out;
>  	}
>  	flags = share_access_to_flags(open->op_share_access);
> @@ -6235,7 +6235,7 @@ nfs4_laundromat(struct nfsd_net *nn)
>  		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
>  		if (!state_expired(&lt, dp->dl_time))
>  			break;
> -		WARN_ON(!unhash_delegation_locked(dp));
> +		WARN_ON(!unhash_delegation_locked(dp, NFS4_REVOKED_DELEG_STID));
>  		list_add(&dp->dl_recall_lru, &reaplist);
>  	}
>  	spin_unlock(&state_lock);
> @@ -8350,7 +8350,7 @@ nfs4_state_shutdown_net(struct net *net)
>  	spin_lock(&state_lock);
>  	list_for_each_safe(pos, next, &nn->del_recall_lru) {
>  		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
> -		WARN_ON(!unhash_delegation_locked(dp));
> +		WARN_ON(!unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID));
>  		list_add(&dp->dl_recall_lru, &reaplist);
>  	}
>  	spin_unlock(&state_lock);

Same question here. Should this go to stable? I guess the race is not
generally fatal...

Reviewed-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/9 v3] support admin-revocation of v4 state
  2023-11-17  2:18 [PATCH 0/9 v3] support admin-revocation of v4 state NeilBrown
                   ` (8 preceding siblings ...)
  2023-11-17  2:18 ` [PATCH 9/9] nfsd: allow delegation " NeilBrown
@ 2023-11-17 12:25 ` Jeff Layton
  2023-11-19 23:21   ` NeilBrown
  9 siblings, 1 reply; 31+ messages in thread
From: Jeff Layton @ 2023-11-17 12:25 UTC (permalink / raw)
  To: NeilBrown, Chuck Lever; +Cc: linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Fri, 2023-11-17 at 13:18 +1100, NeilBrown wrote:
> This set adds a prequel series to my previous posting which addresses
> some locking problems around ->sc_type and splits that field into
> to separate fields: sc_type and sc_status.
> The recovation code is modified to accomodate these changed.
> 
> Thanks
> NeilBrown
> 
>  [PATCH 1/9] nfsd: hold ->cl_lock for hash_delegation_locked()
>  [PATCH 2/9] nfsd: avoid race after unhash_delegation_locked()
>  [PATCH 3/9] nfsd: split sc_status out of sc_type
>  [PATCH 4/9] nfsd: prepare for supporting admin-revocation of state
>  [PATCH 5/9] nfsd: allow admin-revoked state to appear in
>  [PATCH 6/9] nfsd: allow admin-revoked NFSv4.0 state to be freed.
>  [PATCH 7/9] nfsd: allow lock state ids to be revoked and then freed
>  [PATCH 8/9] nfsd: allow open state ids to be revoked and then freed
>  [PATCH 9/9] nfsd: allow delegation state ids to be revoked and then
> 

Nice set. I like this overall. One (other) question: do we need to add
handling for revoking layout stateids as well?

I'll plan to pull this into my local branch for some testing over the
weekend.
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/9] nfsd: hold ->cl_lock for hash_delegation_locked()
  2023-11-17  2:18 ` [PATCH 1/9] nfsd: hold ->cl_lock for hash_delegation_locked() NeilBrown
  2023-11-17 10:59   ` Jeff Layton
@ 2023-11-17 15:04   ` Chuck Lever
  2023-11-19 22:07     ` NeilBrown
  1 sibling, 1 reply; 31+ messages in thread
From: Chuck Lever @ 2023-11-17 15:04 UTC (permalink / raw)
  To: NeilBrown; +Cc: Jeff Layton, linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Fri, Nov 17, 2023 at 01:18:47PM +1100, NeilBrown wrote:
> The protocol for creating a new state in nfsd is to allocated the state
> leaving it largely uninitialised, add that state to the ->cl_stateids
> idr so as to reserve a state id, then complete initialisation of the
> state and only set ->sc_type to non-zero once the state is fully
> initialised.
> 
> If a state is found in the idr with ->sc_type == 0, it is ignored.
> The ->cl_lock list is used to avoid races - it is held while checking

s/->cl_lock list/->cl_lock lock


> sc_type during lookup,

In particular, find_stateid_locked(), but yet, not in nfs4_find_file()

Can you help me understand why it's not needed in the latter case?


> and held when a non-zero value is stored in  ->sc_type.

I see a few additional spots where an sc_type value is set and
cl_lock is not held:

init_open_stateid
nfsd4_process_open2


> ... except... hash_delegation_locked() finalises the initialisation of a
> delegation state, but does NOT hold ->cl_lock.
> 
> So this patch takes ->cl_lock at the appropriate time w.r.t other locks,
> and so ensures there are no races (which are extremely unlikely in any
> case).

I would have expected that cl_lock should be taken first. Can the
patch description provide some rationale for the lock ordering
you chose?

Jeff asks in another email whether this fix should get copied to
stable. Since the race is unlikely, I'm inclined to wait for an
explicit problem report.


> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  fs/nfsd/nfs4state.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 65fd5510323a..6368788a7d4e 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1317,6 +1317,7 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
>  
>  	lockdep_assert_held(&state_lock);
>  	lockdep_assert_held(&fp->fi_lock);
> +	lockdep_assert_held(&clp->cl_lock);
>  
>  	if (nfs4_delegation_exists(clp, fp))
>  		return -EAGAIN;
> @@ -5609,12 +5610,14 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
>  		goto out_unlock;
>  
>  	spin_lock(&state_lock);
> +	spin_lock(&clp->cl_lock);
>  	spin_lock(&fp->fi_lock);
>  	if (fp->fi_had_conflict)
>  		status = -EAGAIN;
>  	else
>  		status = hash_delegation_locked(dp, fp);
>  	spin_unlock(&fp->fi_lock);
> +	spin_unlock(&clp->cl_lock);
>  	spin_unlock(&state_lock);
>  
>  	if (status)
> -- 
> 2.42.0
> 

-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/9] nfsd: avoid race after unhash_delegation_locked()
  2023-11-17 11:41   ` Jeff Layton
@ 2023-11-17 19:43     ` Chuck Lever
  2023-11-19 22:23       ` NeilBrown
  0 siblings, 1 reply; 31+ messages in thread
From: Chuck Lever @ 2023-11-17 19:43 UTC (permalink / raw)
  To: Jeff Layton; +Cc: NeilBrown, linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Fri, Nov 17, 2023 at 06:41:37AM -0500, Jeff Layton wrote:
> On Fri, 2023-11-17 at 13:18 +1100, NeilBrown wrote:
> > NFS4_CLOSED_DELEG_STID and NFS4_REVOKED_DELEG_STID are similar in
> > purpose.
> > REVOKED is used for NFSv4.1 states which have been revoked because the
> > lease has expired.  CLOSED is used in other cases.
> > The difference has two practical effects.
> > 1/ REVOKED states are on the ->cl_revoked list
> > 2/ REVOKED states result in nfserr_deleg_revoked from
> >    nfsd4_verify_open_stid() asnd nfsd4_validate_stateid while
> >    CLOSED states result in nfserr_bad_stid.
> > 
> > Currently a state that is being revoked is first set to "CLOSED" in
> > unhash_delegation_locked(), then possibly to "REVOKED" in
> > revoke_delegation(), at which point it is added to the cl_revoked list.
> > 
> > It is possible that a stateid test could see the CLOSED state
> > which really should be REVOKED, and so return the wrong error code.  So
> > it is safest to remove this window of inconsistency.
> > 
> > With this patch, unhash_delegation_locked() always set the state
> > correctly, and revoke_delegation() no longer changes the state.
> > 
> > Also remove a redundant test on minorversion when
> > NFS4_REVOKED_DELEG_STID is seen - it can only be seen when minorversion
> > is non-zero.
> > 
> > Signed-off-by: NeilBrown <neilb@suse.de>
> > ---
> >  fs/nfsd/nfs4state.c | 20 ++++++++++----------
> >  1 file changed, 10 insertions(+), 10 deletions(-)
> > 
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 6368788a7d4e..7469583382fb 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -1334,7 +1334,7 @@ static bool delegation_hashed(struct nfs4_delegation *dp)
> >  }
> >  
> >  static bool
> > -unhash_delegation_locked(struct nfs4_delegation *dp)
> > +unhash_delegation_locked(struct nfs4_delegation *dp, unsigned char type)

unsigned short type ?


> >  {
> >  	struct nfs4_file *fp = dp->dl_stid.sc_file;
> >  
> > @@ -1343,7 +1343,9 @@ unhash_delegation_locked(struct nfs4_delegation *dp)
> >  	if (!delegation_hashed(dp))
> >  		return false;
> >  
> > -	dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
> > +	if (dp->dl_stid.sc_client->cl_minorversion == 0)
> > +		type = NFS4_CLOSED_DELEG_STID;
> > +	dp->dl_stid.sc_type = type;
> >  	/* Ensure that deleg break won't try to requeue it */
> >  	++dp->dl_time;
> >  	spin_lock(&fp->fi_lock);
> > @@ -1359,7 +1361,7 @@ static void destroy_delegation(struct nfs4_delegation *dp)
> >  	bool unhashed;
> >  
> >  	spin_lock(&state_lock);
> > -	unhashed = unhash_delegation_locked(dp);
> > +	unhashed = unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID);
> >  	spin_unlock(&state_lock);
> >  	if (unhashed)
> >  		destroy_unhashed_deleg(dp);
> > @@ -1373,9 +1375,8 @@ static void revoke_delegation(struct nfs4_delegation *dp)
> >  
> >  	trace_nfsd_stid_revoke(&dp->dl_stid);
> >  
> > -	if (clp->cl_minorversion) {
> > +	if (dp->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID) {
> >  		spin_lock(&clp->cl_lock);
> > -		dp->dl_stid.sc_type = NFS4_REVOKED_DELEG_STID;
> >  		refcount_inc(&dp->dl_stid.sc_count);
> >  		list_add(&dp->dl_recall_lru, &clp->cl_revoked);
> >  		spin_unlock(&clp->cl_lock);
> > @@ -2234,7 +2235,7 @@ __destroy_client(struct nfs4_client *clp)
> >  	spin_lock(&state_lock);
> >  	while (!list_empty(&clp->cl_delegations)) {
> >  		dp = list_entry(clp->cl_delegations.next, struct nfs4_delegation, dl_perclnt);
> > -		WARN_ON(!unhash_delegation_locked(dp));
> > +		WARN_ON(!unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID));
> >  		list_add(&dp->dl_recall_lru, &reaplist);
> >  	}
> >  	spin_unlock(&state_lock);
> > @@ -5197,8 +5198,7 @@ nfs4_check_deleg(struct nfs4_client *cl, struct nfsd4_open *open,
> >  		goto out;
> >  	if (deleg->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID) {
> >  		nfs4_put_stid(&deleg->dl_stid);
> > -		if (cl->cl_minorversion)
> > -			status = nfserr_deleg_revoked;
> > +		status = nfserr_deleg_revoked;
> >  		goto out;
> >  	}
> >  	flags = share_access_to_flags(open->op_share_access);
> > @@ -6235,7 +6235,7 @@ nfs4_laundromat(struct nfsd_net *nn)
> >  		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
> >  		if (!state_expired(&lt, dp->dl_time))
> >  			break;
> > -		WARN_ON(!unhash_delegation_locked(dp));
> > +		WARN_ON(!unhash_delegation_locked(dp, NFS4_REVOKED_DELEG_STID));
> >  		list_add(&dp->dl_recall_lru, &reaplist);
> >  	}
> >  	spin_unlock(&state_lock);
> > @@ -8350,7 +8350,7 @@ nfs4_state_shutdown_net(struct net *net)
> >  	spin_lock(&state_lock);
> >  	list_for_each_safe(pos, next, &nn->del_recall_lru) {
> >  		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
> > -		WARN_ON(!unhash_delegation_locked(dp));
> > +		WARN_ON(!unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID));

Neil, what say we get rid of these WARN_ONs?


> >  		list_add(&dp->dl_recall_lru, &reaplist);
> >  	}
> >  	spin_unlock(&state_lock);
> 
> Same question here. Should this go to stable? I guess the race is not
> generally fatal...

Again, there's no existing bug report, so no urgency to get this
backported. I would see these changes soak in upstream rather than
pull them into stable quickly only to discover another bug has been
introduced.

We don't have a failing test or a data corruption risk, as far as
I can tell.


> Reviewed-by: Jeff Layton <jlayton@kernel.org>

-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/9] nfsd: split sc_status out of sc_type
  2023-11-17  2:18 ` [PATCH 3/9] nfsd: split sc_status out of sc_type NeilBrown
@ 2023-11-17 19:52   ` Chuck Lever
  2023-11-19 22:35     ` NeilBrown
  0 siblings, 1 reply; 31+ messages in thread
From: Chuck Lever @ 2023-11-17 19:52 UTC (permalink / raw)
  To: NeilBrown; +Cc: Jeff Layton, linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Fri, Nov 17, 2023 at 01:18:49PM +1100, NeilBrown wrote:
> sc_type identifies the type of a state - open, lock, deleg, layout - and
> also the status of a state - closed or revoked.
> 
> This is a bit untidy and could get worse when "admin-revoked" states are
> added.  So clean it up.
> 
> With this patch, the type is now all that is stored in sc_type.  This is
> zero when the state is first added to ->cl_stateids (causing it to be
> ignored), and is then set appropriately once it is fully initialised.
> It is set under ->cl_lock to ensure atomicity w.r.t lookup.  It is now
> never cleared.
> 
> sc_type is still a bit-set even though at most one bit is set.  This allows
> lookup functions to be given a bitmap of acceptable types.
> 
> cl_type is now an unsigned short rather than char.  There is no value in

s/cl_type/sc_type


> restricting to just 8 bits.  When passed to a function or stored in local
> variable, "unsigned int" is used.

This seems confusing, and I'd prefer not to introduce implicit type
conversions where they aren't truly necessary. Why not use "unsigned
short" or even "unsigned int" everywhere?


> The status is stored in a separate short named "cl_status".  It has two
> flags: NFS4_STID_CLOSED and NFS4_STID_REVOKED.

s/cl_status/sc_status


> CLOSED combines NFS4_CLOSED_STID, NFS4_CLOSED_DELEG_STID, and is used
> for LOCK_STID and LAYOUT_STID instead of setting the sc_type to zero.
> These flags are only ever set, never cleared.
> For deleg stateids they are set under the global state_lock.
> For open and lock stateids they are set under ->cl_lock.
> For layout stateids they are set under ->ls_lock
> 
> nfs4_unhash_stid() has been removed, and we never set sc_type = 0.  This
> was only used for LOCK and LAYOUT stids and they now use
> NFS4_STID_CLOSED.
> 
> Also TRACE_DEFINE_NUM() calls for the various STD #define have been
> removed because these things are not enums, and so that call is
> incorrect.
> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  fs/nfsd/nfs4layouts.c |   6 +-
>  fs/nfsd/nfs4state.c   | 165 +++++++++++++++++++++---------------------
>  fs/nfsd/state.h       |  40 +++++++---
>  fs/nfsd/trace.h       |  23 +++---
>  4 files changed, 122 insertions(+), 112 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
> index 5e8096bc5eaa..eb8bb05a8df5 100644
> --- a/fs/nfsd/nfs4layouts.c
> +++ b/fs/nfsd/nfs4layouts.c
> @@ -269,13 +269,13 @@ nfsd4_preprocess_layout_stateid(struct svc_rqst *rqstp,
>  {
>  	struct nfs4_layout_stateid *ls;
>  	struct nfs4_stid *stid;
> -	unsigned char typemask = NFS4_LAYOUT_STID;
> +	unsigned int typemask = NFS4_LAYOUT_STID;
>  	__be32 status;
>  
>  	if (create)
>  		typemask |= (NFS4_OPEN_STID | NFS4_LOCK_STID | NFS4_DELEG_STID);
>  
> -	status = nfsd4_lookup_stateid(cstate, stateid, typemask, &stid,
> +	status = nfsd4_lookup_stateid(cstate, stateid, typemask, 0, &stid,
>  			net_generic(SVC_NET(rqstp), nfsd_net_id));
>  	if (status)
>  		goto out;
> @@ -518,7 +518,7 @@ nfsd4_return_file_layouts(struct svc_rqst *rqstp,
>  		lrp->lrs_present = true;
>  	} else {
>  		trace_nfsd_layoutstate_unhash(&ls->ls_stid.sc_stateid);
> -		nfs4_unhash_stid(&ls->ls_stid);
> +		ls->ls_stid.sc_status |= NFS4_STID_CLOSED;
>  		lrp->lrs_present = false;
>  	}
>  	spin_unlock(&ls->ls_lock);
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 7469583382fb..b1fdd0ee0f5c 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1265,11 +1265,6 @@ static void destroy_unhashed_deleg(struct nfs4_delegation *dp)
>  	nfs4_put_stid(&dp->dl_stid);
>  }
>  
> -void nfs4_unhash_stid(struct nfs4_stid *s)
> -{
> -	s->sc_type = 0;
> -}
> -
>  /**
>   * nfs4_delegation_exists - Discover if this delegation already exists
>   * @clp:     a pointer to the nfs4_client we're granting a delegation to
> @@ -1334,7 +1329,7 @@ static bool delegation_hashed(struct nfs4_delegation *dp)
>  }
>  
>  static bool
> -unhash_delegation_locked(struct nfs4_delegation *dp, unsigned char type)
> +unhash_delegation_locked(struct nfs4_delegation *dp, unsigned int statusmask)
>  {
>  	struct nfs4_file *fp = dp->dl_stid.sc_file;
>  
> @@ -1344,8 +1339,9 @@ unhash_delegation_locked(struct nfs4_delegation *dp, unsigned char type)
>  		return false;
>  
>  	if (dp->dl_stid.sc_client->cl_minorversion == 0)
> -		type = NFS4_CLOSED_DELEG_STID;
> -	dp->dl_stid.sc_type = type;
> +		statusmask = NFS4_STID_CLOSED;
> +	dp->dl_stid.sc_status |= statusmask;
> +
>  	/* Ensure that deleg break won't try to requeue it */
>  	++dp->dl_time;
>  	spin_lock(&fp->fi_lock);
> @@ -1361,7 +1357,7 @@ static void destroy_delegation(struct nfs4_delegation *dp)
>  	bool unhashed;
>  
>  	spin_lock(&state_lock);
> -	unhashed = unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID);
> +	unhashed = unhash_delegation_locked(dp, NFS4_STID_CLOSED);
>  	spin_unlock(&state_lock);
>  	if (unhashed)
>  		destroy_unhashed_deleg(dp);
> @@ -1375,7 +1371,7 @@ static void revoke_delegation(struct nfs4_delegation *dp)
>  
>  	trace_nfsd_stid_revoke(&dp->dl_stid);
>  
> -	if (dp->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID) {
> +	if (dp->dl_stid.sc_status & NFS4_STID_REVOKED) {
>  		spin_lock(&clp->cl_lock);
>  		refcount_inc(&dp->dl_stid.sc_count);
>  		list_add(&dp->dl_recall_lru, &clp->cl_revoked);
> @@ -1384,8 +1380,8 @@ static void revoke_delegation(struct nfs4_delegation *dp)
>  	destroy_unhashed_deleg(dp);
>  }
>  
> -/* 
> - * SETCLIENTID state 
> +/*
> + * SETCLIENTID state
>   */
>  
>  static unsigned int clientid_hashval(u32 id)
> @@ -1548,7 +1544,7 @@ static bool unhash_lock_stateid(struct nfs4_ol_stateid *stp)
>  	if (!unhash_ol_stateid(stp))
>  		return false;
>  	list_del_init(&stp->st_locks);
> -	nfs4_unhash_stid(&stp->st_stid);
> +	stp->st_stid.sc_status |= NFS4_STID_CLOSED;
>  	return true;
>  }
>  
> @@ -1627,6 +1623,7 @@ static void release_open_stateid(struct nfs4_ol_stateid *stp)
>  	LIST_HEAD(reaplist);
>  
>  	spin_lock(&stp->st_stid.sc_client->cl_lock);
> +	stp->st_stid.sc_status |= NFS4_STID_CLOSED;
>  	if (unhash_open_stateid(stp, &reaplist))
>  		put_ol_stateid_locked(stp, &reaplist);
>  	spin_unlock(&stp->st_stid.sc_client->cl_lock);
> @@ -2235,7 +2232,7 @@ __destroy_client(struct nfs4_client *clp)
>  	spin_lock(&state_lock);
>  	while (!list_empty(&clp->cl_delegations)) {
>  		dp = list_entry(clp->cl_delegations.next, struct nfs4_delegation, dl_perclnt);
> -		WARN_ON(!unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID));
> +		WARN_ON(!unhash_delegation_locked(dp, NFS4_STID_CLOSED));
>  		list_add(&dp->dl_recall_lru, &reaplist);
>  	}
>  	spin_unlock(&state_lock);
> @@ -2467,14 +2464,16 @@ find_stateid_locked(struct nfs4_client *cl, stateid_t *t)
>  }
>  
>  static struct nfs4_stid *
> -find_stateid_by_type(struct nfs4_client *cl, stateid_t *t, char typemask)
> +find_stateid_by_type(struct nfs4_client *cl, stateid_t *t,
> +		     unsigned int typemask, unsigned int ok_states)
>  {
>  	struct nfs4_stid *s;
>  
>  	spin_lock(&cl->cl_lock);
>  	s = find_stateid_locked(cl, t);
>  	if (s != NULL) {
> -		if (typemask & s->sc_type)
> +		if ((s->sc_status & ~ok_states) == 0 &&
> +		    (typemask & s->sc_type))
>  			refcount_inc(&s->sc_count);
>  		else
>  			s = NULL;
> @@ -4584,7 +4583,8 @@ nfsd4_find_existing_open(struct nfs4_file *fp, struct nfsd4_open *open)
>  			continue;
>  		if (local->st_stateowner != &oo->oo_owner)
>  			continue;
> -		if (local->st_stid.sc_type == NFS4_OPEN_STID) {
> +		if (local->st_stid.sc_type == NFS4_OPEN_STID &&
> +		    !local->st_stid.sc_status) {
>  			ret = local;
>  			refcount_inc(&ret->st_stid.sc_count);
>  			break;
> @@ -4598,17 +4598,10 @@ nfsd4_verify_open_stid(struct nfs4_stid *s)
>  {
>  	__be32 ret = nfs_ok;
>  
> -	switch (s->sc_type) {
> -	default:
> -		break;
> -	case 0:
> -	case NFS4_CLOSED_STID:
> -	case NFS4_CLOSED_DELEG_STID:
> -		ret = nfserr_bad_stateid;
> -		break;
> -	case NFS4_REVOKED_DELEG_STID:
> +	if (s->sc_status & NFS4_STID_REVOKED)
>  		ret = nfserr_deleg_revoked;
> -	}
> +	else if (s->sc_status & NFS4_STID_CLOSED)
> +		ret = nfserr_bad_stateid;
>  	return ret;
>  }
>  
> @@ -4921,9 +4914,9 @@ static int nfsd4_cb_recall_done(struct nfsd4_callback *cb,
>  
>  	trace_nfsd_cb_recall_done(&dp->dl_stid.sc_stateid, task);
>  
> -	if (dp->dl_stid.sc_type == NFS4_CLOSED_DELEG_STID ||
> -	    dp->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID)
> -	        return 1;
> +	if (dp->dl_stid.sc_status)
> +		/* CLOSED or REVOKED */
> +		return 1;
>  
>  	switch (task->tk_status) {
>  	case 0:
> @@ -5168,12 +5161,12 @@ static int share_access_to_flags(u32 share_access)
>  	return share_access == NFS4_SHARE_ACCESS_READ ? RD_STATE : WR_STATE;
>  }
>  
> -static struct nfs4_delegation *find_deleg_stateid(struct nfs4_client *cl, stateid_t *s)
> +static struct nfs4_delegation *find_deleg_stateid(struct nfs4_client *cl,
> +						  stateid_t *s)
>  {
>  	struct nfs4_stid *ret;
>  
> -	ret = find_stateid_by_type(cl, s,
> -				NFS4_DELEG_STID|NFS4_REVOKED_DELEG_STID);
> +	ret = find_stateid_by_type(cl, s, NFS4_DELEG_STID, NFS4_STID_REVOKED);
>  	if (!ret)
>  		return NULL;
>  	return delegstateid(ret);
> @@ -5196,7 +5189,7 @@ nfs4_check_deleg(struct nfs4_client *cl, struct nfsd4_open *open,
>  	deleg = find_deleg_stateid(cl, &open->op_delegate_stateid);
>  	if (deleg == NULL)
>  		goto out;
> -	if (deleg->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID) {
> +	if (deleg->dl_stid.sc_status & NFS4_STID_REVOKED) {
>  		nfs4_put_stid(&deleg->dl_stid);
>  		status = nfserr_deleg_revoked;
>  		goto out;
> @@ -5843,7 +5836,6 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
>  	} else {
>  		status = nfs4_get_vfs_file(rqstp, fp, current_fh, stp, open, true);
>  		if (status) {
> -			stp->st_stid.sc_type = NFS4_CLOSED_STID;
>  			release_open_stateid(stp);
>  			mutex_unlock(&stp->st_mutex);
>  			goto out;
> @@ -6235,7 +6227,7 @@ nfs4_laundromat(struct nfsd_net *nn)
>  		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
>  		if (!state_expired(&lt, dp->dl_time))
>  			break;
> -		WARN_ON(!unhash_delegation_locked(dp, NFS4_REVOKED_DELEG_STID));
> +		WARN_ON(!unhash_delegation_locked(dp, NFS4_STID_REVOKED));
>  		list_add(&dp->dl_recall_lru, &reaplist);
>  	}
>  	spin_unlock(&state_lock);
> @@ -6474,22 +6466,20 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
>  	status = nfsd4_stid_check_stateid_generation(stateid, s, 1);
>  	if (status)
>  		goto out_unlock;
> +	status = nfsd4_verify_open_stid(s);
> +	if (status)
> +		goto out_unlock;
> +
>  	switch (s->sc_type) {
>  	case NFS4_DELEG_STID:
>  		status = nfs_ok;
>  		break;
> -	case NFS4_REVOKED_DELEG_STID:
> -		status = nfserr_deleg_revoked;
> -		break;
>  	case NFS4_OPEN_STID:
>  	case NFS4_LOCK_STID:
>  		status = nfsd4_check_openowner_confirmed(openlockstateid(s));
>  		break;
>  	default:
>  		printk("unknown stateid type %x\n", s->sc_type);
> -		fallthrough;
> -	case NFS4_CLOSED_STID:
> -	case NFS4_CLOSED_DELEG_STID:
>  		status = nfserr_bad_stateid;
>  	}
>  out_unlock:
> @@ -6499,7 +6489,8 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
>  
>  __be32
>  nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
> -		     stateid_t *stateid, unsigned char typemask,
> +		     stateid_t *stateid,
> +		     unsigned int typemask, unsigned int statusmask,
>  		     struct nfs4_stid **s, struct nfsd_net *nn)
>  {
>  	__be32 status;
> @@ -6510,10 +6501,13 @@ nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
>  	 *  only return revoked delegations if explicitly asked.
>  	 *  otherwise we report revoked or bad_stateid status.
>  	 */
> -	if (typemask & NFS4_REVOKED_DELEG_STID)
> +	if (statusmask & NFS4_STID_REVOKED)
>  		return_revoked = true;
> -	else if (typemask & NFS4_DELEG_STID)
> -		typemask |= NFS4_REVOKED_DELEG_STID;
> +	if (typemask & NFS4_DELEG_STID)
> +		/* Always allow REVOKED for DELEG so we can
> +		 * retturn the appropriate error.
> +		 */
> +		statusmask |= NFS4_STID_REVOKED;
>  
>  	if (ZERO_STATEID(stateid) || ONE_STATEID(stateid) ||
>  		CLOSE_STATEID(stateid))
> @@ -6526,14 +6520,12 @@ nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
>  	}
>  	if (status)
>  		return status;
> -	stid = find_stateid_by_type(cstate->clp, stateid, typemask);
> +	stid = find_stateid_by_type(cstate->clp, stateid, typemask, statusmask);
>  	if (!stid)
>  		return nfserr_bad_stateid;
> -	if ((stid->sc_type == NFS4_REVOKED_DELEG_STID) && !return_revoked) {
> +	if ((stid->sc_status & NFS4_STID_REVOKED) && !return_revoked) {
>  		nfs4_put_stid(stid);
> -		if (cstate->minorversion)
> -			return nfserr_deleg_revoked;
> -		return nfserr_bad_stateid;
> +		return nfserr_deleg_revoked;
>  	}
>  	*s = stid;
>  	return nfs_ok;
> @@ -6544,7 +6536,7 @@ nfs4_find_file(struct nfs4_stid *s, int flags)
>  {
>  	struct nfsd_file *ret = NULL;
>  
> -	if (!s)
> +	if (!s || s->sc_status)
>  		return NULL;
>  
>  	switch (s->sc_type) {
> @@ -6667,7 +6659,8 @@ static __be32 find_cpntf_state(struct nfsd_net *nn, stateid_t *st,
>  		goto out;
>  
>  	*stid = find_stateid_by_type(found, &cps->cp_p_stateid,
> -			NFS4_DELEG_STID|NFS4_OPEN_STID|NFS4_LOCK_STID);
> +				     NFS4_DELEG_STID|NFS4_OPEN_STID|NFS4_LOCK_STID,
> +				     0);
>  	if (*stid)
>  		status = nfs_ok;
>  	else
> @@ -6725,7 +6718,7 @@ nfs4_preprocess_stateid_op(struct svc_rqst *rqstp,
>  
>  	status = nfsd4_lookup_stateid(cstate, stateid,
>  				NFS4_DELEG_STID|NFS4_OPEN_STID|NFS4_LOCK_STID,
> -				&s, nn);
> +				0, &s, nn);
>  	if (status == nfserr_bad_stateid)
>  		status = find_cpntf_state(nn, stateid, &s);
>  	if (status)
> @@ -6743,9 +6736,6 @@ nfs4_preprocess_stateid_op(struct svc_rqst *rqstp,
>  	case NFS4_LOCK_STID:
>  		status = nfs4_check_olstateid(openlockstateid(s), flags);
>  		break;
> -	default:
> -		status = nfserr_bad_stateid;
> -		break;
>  	}
>  	if (status)
>  		goto out;
> @@ -6824,11 +6814,20 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  
>  	spin_lock(&cl->cl_lock);
>  	s = find_stateid_locked(cl, stateid);
> -	if (!s)
> +	if (!s || s->sc_status & NFS4_STID_CLOSED)
>  		goto out_unlock;
>  	spin_lock(&s->sc_lock);
>  	switch (s->sc_type) {
>  	case NFS4_DELEG_STID:
> +		if (s->sc_status & NFS4_STID_REVOKED) {
> +			spin_unlock(&s->sc_lock);
> +			dp = delegstateid(s);
> +			list_del_init(&dp->dl_recall_lru);
> +			spin_unlock(&cl->cl_lock);
> +			nfs4_put_stid(s);
> +			ret = nfs_ok;
> +			goto out;
> +		}
>  		ret = nfserr_locks_held;
>  		break;
>  	case NFS4_OPEN_STID:
> @@ -6843,14 +6842,6 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  		spin_unlock(&cl->cl_lock);
>  		ret = nfsd4_free_lock_stateid(stateid, s);
>  		goto out;
> -	case NFS4_REVOKED_DELEG_STID:
> -		spin_unlock(&s->sc_lock);
> -		dp = delegstateid(s);
> -		list_del_init(&dp->dl_recall_lru);
> -		spin_unlock(&cl->cl_lock);
> -		nfs4_put_stid(s);
> -		ret = nfs_ok;
> -		goto out;
>  	/* Default falls through and returns nfserr_bad_stateid */
>  	}
>  	spin_unlock(&s->sc_lock);
> @@ -6893,6 +6884,7 @@ static __be32 nfs4_seqid_op_checks(struct nfsd4_compound_state *cstate, stateid_
>   * @seqid: seqid (provided by client)
>   * @stateid: stateid (provided by client)
>   * @typemask: mask of allowable types for this operation
> + * @statusmask: mask of allowed states: 0 or STID_CLOSED
>   * @stpp: return pointer for the stateid found
>   * @nn: net namespace for request
>   *
> @@ -6902,7 +6894,8 @@ static __be32 nfs4_seqid_op_checks(struct nfsd4_compound_state *cstate, stateid_
>   */
>  static __be32
>  nfs4_preprocess_seqid_op(struct nfsd4_compound_state *cstate, u32 seqid,
> -			 stateid_t *stateid, char typemask,
> +			 stateid_t *stateid,
> +			 unsigned int typemask, unsigned int statusmask,
>  			 struct nfs4_ol_stateid **stpp,
>  			 struct nfsd_net *nn)
>  {
> @@ -6913,7 +6906,8 @@ nfs4_preprocess_seqid_op(struct nfsd4_compound_state *cstate, u32 seqid,
>  	trace_nfsd_preprocess(seqid, stateid);
>  
>  	*stpp = NULL;
> -	status = nfsd4_lookup_stateid(cstate, stateid, typemask, &s, nn);
> +	status = nfsd4_lookup_stateid(cstate, stateid,
> +				      typemask, statusmask, &s, nn);
>  	if (status)
>  		return status;
>  	stp = openlockstateid(s);
> @@ -6935,7 +6929,7 @@ static __be32 nfs4_preprocess_confirmed_seqid_op(struct nfsd4_compound_state *cs
>  	struct nfs4_ol_stateid *stp;
>  
>  	status = nfs4_preprocess_seqid_op(cstate, seqid, stateid,
> -						NFS4_OPEN_STID, &stp, nn);
> +					  NFS4_OPEN_STID, 0, &stp, nn);
>  	if (status)
>  		return status;
>  	oo = openowner(stp->st_stateowner);
> @@ -6966,8 +6960,8 @@ nfsd4_open_confirm(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  		return status;
>  
>  	status = nfs4_preprocess_seqid_op(cstate,
> -					oc->oc_seqid, &oc->oc_req_stateid,
> -					NFS4_OPEN_STID, &stp, nn);
> +					  oc->oc_seqid, &oc->oc_req_stateid,
> +					  NFS4_OPEN_STID, 0, &stp, nn);
>  	if (status)
>  		goto out;
>  	oo = openowner(stp->st_stateowner);
> @@ -7097,18 +7091,20 @@ nfsd4_close(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  	struct net *net = SVC_NET(rqstp);
>  	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
>  
> -	dprintk("NFSD: nfsd4_close on file %pd\n", 
> +	dprintk("NFSD: nfsd4_close on file %pd\n",
>  			cstate->current_fh.fh_dentry);
>  
>  	status = nfs4_preprocess_seqid_op(cstate, close->cl_seqid,
> -					&close->cl_stateid,
> -					NFS4_OPEN_STID|NFS4_CLOSED_STID,
> -					&stp, nn);
> +					  &close->cl_stateid,
> +					  NFS4_OPEN_STID, NFS4_STID_CLOSED,
> +					  &stp, nn);
>  	nfsd4_bump_seqid(cstate, status);
>  	if (status)
> -		goto out; 
> +		goto out;
>  
> -	stp->st_stid.sc_type = NFS4_CLOSED_STID;
> +	spin_lock(&stp->st_stid.sc_client->cl_lock);
> +	stp->st_stid.sc_status |= NFS4_STID_CLOSED;
> +	spin_unlock(&stp->st_stid.sc_client->cl_lock);
>  
>  	/*
>  	 * Technically we don't _really_ have to increment or copy it, since
> @@ -7150,7 +7146,7 @@ nfsd4_delegreturn(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  	if ((status = fh_verify(rqstp, &cstate->current_fh, S_IFREG, 0)))
>  		return status;
>  
> -	status = nfsd4_lookup_stateid(cstate, stateid, NFS4_DELEG_STID, &s, nn);
> +	status = nfsd4_lookup_stateid(cstate, stateid, NFS4_DELEG_STID, 0, &s, nn);
>  	if (status)
>  		goto out;
>  	dp = delegstateid(s);
> @@ -7604,9 +7600,10 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  							&lock_stp, &new);
>  	} else {
>  		status = nfs4_preprocess_seqid_op(cstate,
> -				       lock->lk_old_lock_seqid,
> -				       &lock->lk_old_lock_stateid,
> -				       NFS4_LOCK_STID, &lock_stp, nn);
> +						  lock->lk_old_lock_seqid,
> +						  &lock->lk_old_lock_stateid,
> +						  NFS4_LOCK_STID, 0, &lock_stp,
> +						  nn);
>  	}
>  	if (status)
>  		goto out;
> @@ -7919,8 +7916,8 @@ nfsd4_locku(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  		 return nfserr_inval;
>  
>  	status = nfs4_preprocess_seqid_op(cstate, locku->lu_seqid,
> -					&locku->lu_stateid, NFS4_LOCK_STID,
> -					&stp, nn);
> +					  &locku->lu_stateid, NFS4_LOCK_STID, 0,
> +					  &stp, nn);
>  	if (status)
>  		goto out;
>  	nf = find_any_file(stp->st_stid.sc_file);
> @@ -8350,7 +8347,7 @@ nfs4_state_shutdown_net(struct net *net)
>  	spin_lock(&state_lock);
>  	list_for_each_safe(pos, next, &nn->del_recall_lru) {
>  		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
> -		WARN_ON(!unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID));
> +		WARN_ON(!unhash_delegation_locked(dp, NFS4_STID_CLOSED));
>  		list_add(&dp->dl_recall_lru, &reaplist);
>  	}
>  	spin_unlock(&state_lock);
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index f96eaa8e9413..22f9eb3986d2 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -88,17 +88,33 @@ struct nfsd4_callback_ops {
>   */
>  struct nfs4_stid {
>  	refcount_t		sc_count;
> -#define NFS4_OPEN_STID 1
> -#define NFS4_LOCK_STID 2
> -#define NFS4_DELEG_STID 4
> -/* For an open stateid kept around *only* to process close replays: */
> -#define NFS4_CLOSED_STID 8
> +
> +	/* A new stateid is added to the cl_stateids idr early before it
> +	 * is fully initialised.  Its sc_type is then zero.  After
> +	 * initialisation the sc_type it set under cl_lock, and then
> +	 * never changes.
> +	 */
> +#define NFS4_OPEN_STID		BIT(0)
> +#define NFS4_LOCK_STID		BIT(1)
> +#define NFS4_DELEG_STID		BIT(2)
> +#define NFS4_LAYOUT_STID	BIT(3)
> +	unsigned short		sc_type;
> +
> +/* state_lock protects sc_status for delegation stateids.
> + * ->cl_lock protects sc_status for open and lock stateids.
> + * ->st_mutex also protect sc_status for open stateids.
> + * ->ls_lock protects sc_status for layout stateids.
> + */
> +/*
> + * For an open stateid kept around *only* to process close replays.
> + * For deleg stateid, kept in idr until last reference is dropped.
> + */
> +#define NFS4_STID_CLOSED	BIT(0)
>  /* For a deleg stateid kept around only to process free_stateid's: */
> -#define NFS4_REVOKED_DELEG_STID 16
> -#define NFS4_CLOSED_DELEG_STID 32
> -#define NFS4_LAYOUT_STID 64
> +#define NFS4_STID_REVOKED	BIT(1)
> +	unsigned short		sc_status;
> +
>  	struct list_head	sc_cp_list;
> -	unsigned char		sc_type;
>  	stateid_t		sc_stateid;
>  	spinlock_t		sc_lock;
>  	struct nfs4_client	*sc_client;
> @@ -694,15 +710,15 @@ extern __be32 nfs4_preprocess_stateid_op(struct svc_rqst *rqstp,
>  		stateid_t *stateid, int flags, struct nfsd_file **filp,
>  		struct nfs4_stid **cstid);
>  __be32 nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
> -		     stateid_t *stateid, unsigned char typemask,
> -		     struct nfs4_stid **s, struct nfsd_net *nn);
> +			    stateid_t *stateid, unsigned int typemask,
> +			    unsigned int statusmask,
> +			    struct nfs4_stid **s, struct nfsd_net *nn);
>  struct nfs4_stid *nfs4_alloc_stid(struct nfs4_client *cl, struct kmem_cache *slab,
>  				  void (*sc_free)(struct nfs4_stid *));
>  int nfs4_init_copy_state(struct nfsd_net *nn, struct nfsd4_copy *copy);
>  void nfs4_free_copy_state(struct nfsd4_copy *copy);
>  struct nfs4_cpntf_state *nfs4_alloc_init_cpntf_state(struct nfsd_net *nn,
>  			struct nfs4_stid *p_stid);
> -void nfs4_unhash_stid(struct nfs4_stid *s);
>  void nfs4_put_stid(struct nfs4_stid *s);
>  void nfs4_inc_and_copy_stateid(stateid_t *dst, struct nfs4_stid *stid);
>  void nfs4_remove_reclaim_record(struct nfs4_client_reclaim *, struct nfsd_net *);
> diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h
> index fbc0ccb40424..54d82e14d724 100644
> --- a/fs/nfsd/trace.h
> +++ b/fs/nfsd/trace.h
> @@ -641,24 +641,18 @@ DEFINE_EVENT(nfsd_stateseqid_class, nfsd_##name, \
>  DEFINE_STATESEQID_EVENT(preprocess);
>  DEFINE_STATESEQID_EVENT(open_confirm);
>  
> -TRACE_DEFINE_ENUM(NFS4_OPEN_STID);
> -TRACE_DEFINE_ENUM(NFS4_LOCK_STID);
> -TRACE_DEFINE_ENUM(NFS4_DELEG_STID);
> -TRACE_DEFINE_ENUM(NFS4_CLOSED_STID);
> -TRACE_DEFINE_ENUM(NFS4_REVOKED_DELEG_STID);
> -TRACE_DEFINE_ENUM(NFS4_CLOSED_DELEG_STID);
> -TRACE_DEFINE_ENUM(NFS4_LAYOUT_STID);
> -
>  #define show_stid_type(x)						\
>  	__print_flags(x, "|",						\
>  		{ NFS4_OPEN_STID,		"OPEN" },		\
>  		{ NFS4_LOCK_STID,		"LOCK" },		\
>  		{ NFS4_DELEG_STID,		"DELEG" },		\
> -		{ NFS4_CLOSED_STID,		"CLOSED" },		\
> -		{ NFS4_REVOKED_DELEG_STID,	"REVOKED" },		\
> -		{ NFS4_CLOSED_DELEG_STID,	"CLOSED_DELEG" },	\
>  		{ NFS4_LAYOUT_STID,		"LAYOUT" })
>  
> +#define show_stid_status(x)						\
> +	__print_flags(x, "|",						\
> +		{ NFS4_STID_CLOSED,		"CLOSED" },		\
> +		{ NFS4_STID_REVOKED,		"REVOKED" })		\
> +
>  DECLARE_EVENT_CLASS(nfsd_stid_class,
>  	TP_PROTO(
>  		const struct nfs4_stid *stid
> @@ -666,6 +660,7 @@ DECLARE_EVENT_CLASS(nfsd_stid_class,
>  	TP_ARGS(stid),
>  	TP_STRUCT__entry(
>  		__field(unsigned long, sc_type)
> +		__field(unsigned long, sc_status)
>  		__field(int, sc_count)
>  		__field(u32, cl_boot)
>  		__field(u32, cl_id)
> @@ -676,16 +671,18 @@ DECLARE_EVENT_CLASS(nfsd_stid_class,
>  		const stateid_t *stp = &stid->sc_stateid;
>  
>  		__entry->sc_type = stid->sc_type;
> +		__entry->sc_status = stid->sc_status;
>  		__entry->sc_count = refcount_read(&stid->sc_count);
>  		__entry->cl_boot = stp->si_opaque.so_clid.cl_boot;
>  		__entry->cl_id = stp->si_opaque.so_clid.cl_id;
>  		__entry->si_id = stp->si_opaque.so_id;
>  		__entry->si_generation = stp->si_generation;
>  	),
> -	TP_printk("client %08x:%08x stateid %08x:%08x ref=%d type=%s",
> +	TP_printk("client %08x:%08x stateid %08x:%08x ref=%d type=%s state=%s",
>  		__entry->cl_boot, __entry->cl_id,
>  		__entry->si_id, __entry->si_generation,
> -		__entry->sc_count, show_stid_type(__entry->sc_type)
> +		__entry->sc_count, show_stid_type(__entry->sc_type),
> +		show_stid_status(__entry->sc_status)
>  	)
>  );
>  
> -- 
> 2.42.0
> 

-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/9] nfsd: allow admin-revoked NFSv4.0 state to be freed.
  2023-11-17  2:18 ` [PATCH 6/9] nfsd: allow admin-revoked NFSv4.0 state to be freed NeilBrown
  2023-11-17 11:34   ` Jeff Layton
@ 2023-11-17 19:58   ` Chuck Lever
  2023-11-19 22:47     ` NeilBrown
  1 sibling, 1 reply; 31+ messages in thread
From: Chuck Lever @ 2023-11-17 19:58 UTC (permalink / raw)
  To: NeilBrown; +Cc: Jeff Layton, linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Fri, Nov 17, 2023 at 01:18:52PM +1100, NeilBrown wrote:
> For NFSv4.1 and later the client easily discovers if there is any
> admin-revoked state and will then find and explicitly free it.
> 
> For NFSv4.0 there is no such mechanism.  The client can only find that
> state is admin-revoked if it tries to use that state, and there is no
> way for it to explicitly free the state.  So the server must hold on to
> the stateid (at least) for an indefinite amount of time.  A
> RELEASE_LOCKOWNER request might justify forgetting some of these
> stateids, as would the whole clients lease lapsing, but these are not
> reliable.
> 
> This patch takes two approaches.
> 
> Whenever a client uses an revoked stateid, that stateid is then
> discarded and will not be recognised again.  This might confuse a client
> which expect to get NFS4ERR_ADMIN_REVOKED consistently once it get it at
> all, but should mostly work.  Hopefully one error will lead to other
> resources being closed (e.g.  process exits), which will result in more
> stateid being freed when a CLOSE attempt gets NFS4ERR_ADMIN_REVOKED.
> 
> Also, any admin-revoked stateids that have been that way for more than
> one lease time are periodically revoke.
> 
> No actual freeing of state happens in this patch.  That will come in
> future patches which handle the different sorts of revoked state.
> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  fs/nfsd/netns.h     |  4 ++
>  fs/nfsd/nfs4state.c | 97 ++++++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 100 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> index ec49b200b797..02f8fa095b0f 100644
> --- a/fs/nfsd/netns.h
> +++ b/fs/nfsd/netns.h
> @@ -197,6 +197,10 @@ struct nfsd_net {
>  	atomic_t		nfsd_courtesy_clients;
>  	struct shrinker		nfsd_client_shrinker;
>  	struct work_struct	nfsd_shrinker_work;
> +
> +	/* last time an admin-revoke happened for NFSv4.0 */
> +	time64_t		nfs40_last_revoke;
> +
>  };

This hunk doesn't apply to nfsd-next due to v6.7-rc changes to NFSD
to implement a dynamic shrinker. So I stopped my review here for
now.


>  /* Simple check to find out if a given net was properly initialized */
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 8debd148840f..8a1b8376ff08 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1724,6 +1724,14 @@ void nfsd4_revoke_states(struct net *net, struct super_block *sb)
>  				}
>  				nfs4_put_stid(stid);
>  				spin_lock(&nn->client_lock);
> +				if (clp->cl_minorversion == 0)
> +					/* Allow cleanup after a lease period.
> +					 * store_release ensures cleanup will
> +					 * see any newly revoked states if it
> +					 * sees the time updated.
> +					 */
> +					nn->nfs40_last_revoke =
> +						ktime_get_boottime_seconds();
>  				goto retry;
>  			}
>  		}
> @@ -4650,6 +4658,39 @@ nfsd4_find_existing_open(struct nfs4_file *fp, struct nfsd4_open *open)
>  	return ret;
>  }
>  
> +static void nfsd_drop_revoked_stid(struct nfs4_stid *s)
> +{
> +	struct nfs4_client *cl = s->sc_client;
> +
> +	switch (s->sc_type) {
> +	default:
> +		spin_unlock(&cl->cl_lock);
> +	}
> +}
> +
> +static void nfs40_drop_revoked_stid(struct nfs4_client *cl,
> +				    stateid_t *stid)
> +{
> +	/* NFSv4.0 has no way for the client to tell the server
> +	 * that it can forget an admin-revoked stateid.
> +	 * So we keep it around until the first time that the
> +	 * client uses it, and drop it the first time
> +	 * nfserr_admin_revoked is returned.
> +	 * For v4.1 and later we wait until explicitly told
> +	 * to free the stateid.
> +	 */
> +	if (cl->cl_minorversion == 0) {
> +		struct nfs4_stid *st;
> +
> +		spin_lock(&cl->cl_lock);
> +		st = find_stateid_locked(cl, stid);
> +		if (st)
> +			nfsd_drop_revoked_stid(st);
> +		else
> +			spin_unlock(&cl->cl_lock);
> +	}
> +}
> +
>  static __be32
>  nfsd4_verify_open_stid(struct nfs4_stid *s)
>  {
> @@ -4672,6 +4713,10 @@ nfsd4_lock_ol_stateid(struct nfs4_ol_stateid *stp)
>  
>  	mutex_lock_nested(&stp->st_mutex, LOCK_STATEID_MUTEX);
>  	ret = nfsd4_verify_open_stid(&stp->st_stid);
> +	if (ret == nfserr_admin_revoked)
> +		nfs40_drop_revoked_stid(stp->st_stid.sc_client,
> +					&stp->st_stid.sc_stateid);
> +
>  	if (ret != nfs_ok)
>  		mutex_unlock(&stp->st_mutex);
>  	return ret;
> @@ -5255,6 +5300,7 @@ nfs4_check_deleg(struct nfs4_client *cl, struct nfsd4_open *open,
>  	}
>  	if (deleg->dl_stid.sc_status & NFS4_STID_REVOKED) {
>  		nfs4_put_stid(&deleg->dl_stid);
> +		nfs40_drop_revoked_stid(cl, &open->op_delegate_stateid);
>  		status = nfserr_deleg_revoked;
>  		goto out;
>  	}
> @@ -6253,6 +6299,43 @@ nfs4_process_client_reaplist(struct list_head *reaplist)
>  	}
>  }
>  
> +static void nfs40_clean_admin_revoked(struct nfsd_net *nn,
> +				      struct laundry_time *lt)
> +{
> +	struct nfs4_client *clp;
> +
> +	spin_lock(&nn->client_lock);
> +	if (nn->nfs40_last_revoke == 0 ||
> +	    nn->nfs40_last_revoke > lt->cutoff) {
> +		spin_unlock(&nn->client_lock);
> +		return;
> +	}
> +	nn->nfs40_last_revoke = 0;
> +
> +retry:
> +	list_for_each_entry(clp, &nn->client_lru, cl_lru) {
> +		unsigned long id, tmp;
> +		struct nfs4_stid *stid;
> +
> +		if (atomic_read(&clp->cl_admin_revoked) == 0)
> +			continue;
> +
> +		spin_lock(&clp->cl_lock);
> +		idr_for_each_entry_ul(&clp->cl_stateids, stid, tmp, id)
> +			if (stid->sc_status & NFS4_STID_ADMIN_REVOKED) {
> +				refcount_inc(&stid->sc_count);
> +				spin_unlock(&nn->client_lock);
> +				/* this function drops ->cl_lock */
> +				nfsd_drop_revoked_stid(stid);
> +				nfs4_put_stid(stid);
> +				spin_lock(&nn->client_lock);
> +				goto retry;
> +			}
> +		spin_unlock(&clp->cl_lock);
> +	}
> +	spin_unlock(&nn->client_lock);
> +}
> +
>  static time64_t
>  nfs4_laundromat(struct nfsd_net *nn)
>  {
> @@ -6286,6 +6369,8 @@ nfs4_laundromat(struct nfsd_net *nn)
>  	nfs4_get_client_reaplist(nn, &reaplist, &lt);
>  	nfs4_process_client_reaplist(&reaplist);
>  
> +	nfs40_clean_admin_revoked(nn, &lt);
> +
>  	spin_lock(&state_lock);
>  	list_for_each_safe(pos, next, &nn->del_recall_lru) {
>  		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
> @@ -6504,6 +6589,9 @@ static __be32 nfsd4_stid_check_stateid_generation(stateid_t *in, struct nfs4_sti
>  	if (ret == nfs_ok)
>  		ret = check_stateid_generation(in, &s->sc_stateid, has_session);
>  	spin_unlock(&s->sc_lock);
> +	if (ret == nfserr_admin_revoked)
> +		nfs40_drop_revoked_stid(s->sc_client,
> +					&s->sc_stateid);
>  	return ret;
>  }
>  
> @@ -6548,6 +6636,8 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
>  	}
>  out_unlock:
>  	spin_unlock(&cl->cl_lock);
> +	if (status == nfserr_admin_revoked)
> +		nfs40_drop_revoked_stid(cl, stateid);
>  	return status;
>  }
>  
> @@ -6594,6 +6684,7 @@ nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
>  		return nfserr_deleg_revoked;
>  	}
>  	if (stid->sc_type & NFS4_STID_ADMIN_REVOKED) {
> +		nfs40_drop_revoked_stid(cstate->clp, stateid);
>  		nfs4_put_stid(stid);
>  		return nfserr_admin_revoked;
>  	}
> @@ -6886,6 +6977,11 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  	s = find_stateid_locked(cl, stateid);
>  	if (!s || s->sc_status & NFS4_STID_CLOSED)
>  		goto out_unlock;
> +	if (s->sc_status & NFS4_STID_ADMIN_REVOKED) {
> +		nfsd_drop_revoked_stid(s);
> +		ret = nfs_ok;
> +		goto out;
> +	}
>  	spin_lock(&s->sc_lock);
>  	switch (s->sc_type) {
>  	case NFS4_DELEG_STID:
> @@ -6912,7 +7008,6 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  		spin_unlock(&cl->cl_lock);
>  		ret = nfsd4_free_lock_stateid(stateid, s);
>  		goto out;
> -	/* Default falls through and returns nfserr_bad_stateid */
>  	}
>  	spin_unlock(&s->sc_lock);
>  out_unlock:
> -- 
> 2.42.0
> 

-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/9] nfsd: hold ->cl_lock for hash_delegation_locked()
  2023-11-17 10:59   ` Jeff Layton
@ 2023-11-19 21:37     ` NeilBrown
  2023-11-19 23:38       ` Chuck Lever
  0 siblings, 1 reply; 31+ messages in thread
From: NeilBrown @ 2023-11-19 21:37 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Chuck Lever, linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Fri, 17 Nov 2023, Jeff Layton wrote:
> On Fri, 2023-11-17 at 13:18 +1100, NeilBrown wrote:
> > The protocol for creating a new state in nfsd is to allocated the state
> > leaving it largely uninitialised, add that state to the ->cl_stateids
> > idr so as to reserve a state id, then complete initialisation of the
> > state and only set ->sc_type to non-zero once the state is fully
> > initialised.
> > 
> > If a state is found in the idr with ->sc_type == 0, it is ignored.
> > The ->cl_lock list is used to avoid races - it is held while checking
> > sc_type during lookup, and held when a non-zero value is stored in
> > ->sc_type.
> > 
> > ... except... hash_delegation_locked() finalises the initialisation of a
> > delegation state, but does NOT hold ->cl_lock.
> > 
> > So this patch takes ->cl_lock at the appropriate time w.r.t other locks,
> > and so ensures there are no races (which are extremely unlikely in any
> > case).
> > 
> > Signed-off-by: NeilBrown <neilb@suse.de>
> > ---
> >  fs/nfsd/nfs4state.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 65fd5510323a..6368788a7d4e 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -1317,6 +1317,7 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
> >  
> >  	lockdep_assert_held(&state_lock);
> >  	lockdep_assert_held(&fp->fi_lock);
> > +	lockdep_assert_held(&clp->cl_lock);
> >  
> >  	if (nfs4_delegation_exists(clp, fp))
> >  		return -EAGAIN;
> > @@ -5609,12 +5610,14 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
> >  		goto out_unlock;
> >  
> >  	spin_lock(&state_lock);
> > +	spin_lock(&clp->cl_lock);
> >  	spin_lock(&fp->fi_lock);
> >  	if (fp->fi_had_conflict)
> >  		status = -EAGAIN;
> >  	else
> >  		status = hash_delegation_locked(dp, fp);
> >  	spin_unlock(&fp->fi_lock);
> > +	spin_unlock(&clp->cl_lock);
> >  	spin_unlock(&state_lock);
> >  
> >  	if (status)
> 
> I know it's (supposedly) an unlikely race, but should we send this to
> stable?

I don't know.  Once upon a time, "stable" meant something.  There was a
clear list of rules.  Those seem to have been torn up.  Now it seems to
be whatever some machine-learning tool chooses.
If that tool chooses this patch (which I suspect it will), I won't
object.  But I don't think it is worth encouraging it.


> 
> Reviewed-by: Jeff Layton <jlayton@kernel.org>
> 
Thanks,

NelBrown



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/9] nfsd: hold ->cl_lock for hash_delegation_locked()
  2023-11-17 15:04   ` Chuck Lever
@ 2023-11-19 22:07     ` NeilBrown
  0 siblings, 0 replies; 31+ messages in thread
From: NeilBrown @ 2023-11-19 22:07 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Jeff Layton, linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Sat, 18 Nov 2023, Chuck Lever wrote:
> On Fri, Nov 17, 2023 at 01:18:47PM +1100, NeilBrown wrote:
> > The protocol for creating a new state in nfsd is to allocated the state
> > leaving it largely uninitialised, add that state to the ->cl_stateids
> > idr so as to reserve a state id, then complete initialisation of the
> > state and only set ->sc_type to non-zero once the state is fully
> > initialised.
> > 
> > If a state is found in the idr with ->sc_type == 0, it is ignored.
> > The ->cl_lock list is used to avoid races - it is held while checking
> 
> s/->cl_lock list/->cl_lock lock
> 
> 
> > sc_type during lookup,
> 
> In particular, find_stateid_locked(), but yet, not in nfs4_find_file()
> 
> Can you help me understand why it's not needed in the latter case?

nfs4_find_file() is called from nfs4_check_file() which is called from
nfs4_preprocess_stateid_op(), which gets the nfs4_stid from
nfsd4_lookup_stateid().
That in turn gets the stateid from find_stateid_by_type() which gets it
from find_stateid_locked().

If find_stateid_locked() returns a stateid, then ->sc_type is not 0, and
it can never be set to zero (At least after subsequent patches land).

Or, more succinctly, nfs4_find_file() does not do lookup, so it doesn't
check sc_type against zero, so it doesn't need a lock.

> 
> 
> > and held when a non-zero value is stored in  ->sc_type.
> 
> I see a few additional spots where an sc_type value is set and
> cl_lock is not held:
> 
> init_open_stateid

 ->cl_lock is taken 9 lines before NFS4_OPEN_STID is assigned to
  >sc_type, and it is dropped 13 lines later.

> nfsd4_process_open2

This assignment does not change from zero to non-zero.  So it cannot
race with lookup, which tests for zero.
A later patch changes this assignment to be a change to the new
sc_status.

> 
> 
> > ... except... hash_delegation_locked() finalises the initialisation of a
> > delegation state, but does NOT hold ->cl_lock.
> > 
> > So this patch takes ->cl_lock at the appropriate time w.r.t other locks,
> > and so ensures there are no races (which are extremely unlikely in any
> > case).
> 
> I would have expected that cl_lock should be taken first. Can the
> patch description provide some rationale for the lock ordering
> you chose?

I've added

As ->fi_lock is often taken when ->cl_lock is held, we need to take
->cl_lock first of those two.
Currently ->cl_lock and state_lock are never both taken at the same time.
We need both for this patch so an arbitrary choice is needed concerning
which to take first.  As state_lock is more global, it might be more
contended, so take it first.


I'm happy to choose a different ordering for ->cl_lock and state_lock if
you have a different justification - I accept that mine isn't
particularly strong.

> 
> Jeff asks in another email whether this fix should get copied to
> stable. Since the race is unlikely, I'm inclined to wait for an
> explicit problem report.

I agree.

Thanks,
NeilBrown


> 
> 
> > Signed-off-by: NeilBrown <neilb@suse.de>
> > ---
> >  fs/nfsd/nfs4state.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 65fd5510323a..6368788a7d4e 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -1317,6 +1317,7 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
> >  
> >  	lockdep_assert_held(&state_lock);
> >  	lockdep_assert_held(&fp->fi_lock);
> > +	lockdep_assert_held(&clp->cl_lock);
> >  
> >  	if (nfs4_delegation_exists(clp, fp))
> >  		return -EAGAIN;
> > @@ -5609,12 +5610,14 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
> >  		goto out_unlock;
> >  
> >  	spin_lock(&state_lock);
> > +	spin_lock(&clp->cl_lock);
> >  	spin_lock(&fp->fi_lock);
> >  	if (fp->fi_had_conflict)
> >  		status = -EAGAIN;
> >  	else
> >  		status = hash_delegation_locked(dp, fp);
> >  	spin_unlock(&fp->fi_lock);
> > +	spin_unlock(&clp->cl_lock);
> >  	spin_unlock(&state_lock);
> >  
> >  	if (status)
> > -- 
> > 2.42.0
> > 
> 
> -- 
> Chuck Lever
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/9] nfsd: avoid race after unhash_delegation_locked()
  2023-11-17 19:43     ` Chuck Lever
@ 2023-11-19 22:23       ` NeilBrown
  2023-11-19 23:41         ` Chuck Lever
  0 siblings, 1 reply; 31+ messages in thread
From: NeilBrown @ 2023-11-19 22:23 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Jeff Layton, linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Sat, 18 Nov 2023, Chuck Lever wrote:
> On Fri, Nov 17, 2023 at 06:41:37AM -0500, Jeff Layton wrote:
> > On Fri, 2023-11-17 at 13:18 +1100, NeilBrown wrote:
> > > NFS4_CLOSED_DELEG_STID and NFS4_REVOKED_DELEG_STID are similar in
> > > purpose.
> > > REVOKED is used for NFSv4.1 states which have been revoked because the
> > > lease has expired.  CLOSED is used in other cases.
> > > The difference has two practical effects.
> > > 1/ REVOKED states are on the ->cl_revoked list
> > > 2/ REVOKED states result in nfserr_deleg_revoked from
> > >    nfsd4_verify_open_stid() asnd nfsd4_validate_stateid while
> > >    CLOSED states result in nfserr_bad_stid.
> > > 
> > > Currently a state that is being revoked is first set to "CLOSED" in
> > > unhash_delegation_locked(), then possibly to "REVOKED" in
> > > revoke_delegation(), at which point it is added to the cl_revoked list.
> > > 
> > > It is possible that a stateid test could see the CLOSED state
> > > which really should be REVOKED, and so return the wrong error code.  So
> > > it is safest to remove this window of inconsistency.
> > > 
> > > With this patch, unhash_delegation_locked() always set the state
> > > correctly, and revoke_delegation() no longer changes the state.
> > > 
> > > Also remove a redundant test on minorversion when
> > > NFS4_REVOKED_DELEG_STID is seen - it can only be seen when minorversion
> > > is non-zero.
> > > 
> > > Signed-off-by: NeilBrown <neilb@suse.de>
> > > ---
> > >  fs/nfsd/nfs4state.c | 20 ++++++++++----------
> > >  1 file changed, 10 insertions(+), 10 deletions(-)
> > > 
> > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > index 6368788a7d4e..7469583382fb 100644
> > > --- a/fs/nfsd/nfs4state.c
> > > +++ b/fs/nfsd/nfs4state.c
> > > @@ -1334,7 +1334,7 @@ static bool delegation_hashed(struct nfs4_delegation *dp)
> > >  }
> > >  
> > >  static bool
> > > -unhash_delegation_locked(struct nfs4_delegation *dp)
> > > +unhash_delegation_locked(struct nfs4_delegation *dp, unsigned char type)
> 
> unsigned short type ?

At this stage in the series 'type' is still an unsigned char.  I don't
want to get ahead of myself.

> 
> 
> > >  {
> > >  	struct nfs4_file *fp = dp->dl_stid.sc_file;
> > >  
> > > @@ -1343,7 +1343,9 @@ unhash_delegation_locked(struct nfs4_delegation *dp)
> > >  	if (!delegation_hashed(dp))
> > >  		return false;
> > >  
> > > -	dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
> > > +	if (dp->dl_stid.sc_client->cl_minorversion == 0)
> > > +		type = NFS4_CLOSED_DELEG_STID;
> > > +	dp->dl_stid.sc_type = type;
> > >  	/* Ensure that deleg break won't try to requeue it */
> > >  	++dp->dl_time;
> > >  	spin_lock(&fp->fi_lock);
> > > @@ -1359,7 +1361,7 @@ static void destroy_delegation(struct nfs4_delegation *dp)
> > >  	bool unhashed;
> > >  
> > >  	spin_lock(&state_lock);
> > > -	unhashed = unhash_delegation_locked(dp);
> > > +	unhashed = unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID);
> > >  	spin_unlock(&state_lock);
> > >  	if (unhashed)
> > >  		destroy_unhashed_deleg(dp);
> > > @@ -1373,9 +1375,8 @@ static void revoke_delegation(struct nfs4_delegation *dp)
> > >  
> > >  	trace_nfsd_stid_revoke(&dp->dl_stid);
> > >  
> > > -	if (clp->cl_minorversion) {
> > > +	if (dp->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID) {
> > >  		spin_lock(&clp->cl_lock);
> > > -		dp->dl_stid.sc_type = NFS4_REVOKED_DELEG_STID;
> > >  		refcount_inc(&dp->dl_stid.sc_count);
> > >  		list_add(&dp->dl_recall_lru, &clp->cl_revoked);
> > >  		spin_unlock(&clp->cl_lock);
> > > @@ -2234,7 +2235,7 @@ __destroy_client(struct nfs4_client *clp)
> > >  	spin_lock(&state_lock);
> > >  	while (!list_empty(&clp->cl_delegations)) {
> > >  		dp = list_entry(clp->cl_delegations.next, struct nfs4_delegation, dl_perclnt);
> > > -		WARN_ON(!unhash_delegation_locked(dp));
> > > +		WARN_ON(!unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID));
> > >  		list_add(&dp->dl_recall_lru, &reaplist);
> > >  	}
> > >  	spin_unlock(&state_lock);
> > > @@ -5197,8 +5198,7 @@ nfs4_check_deleg(struct nfs4_client *cl, struct nfsd4_open *open,
> > >  		goto out;
> > >  	if (deleg->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID) {
> > >  		nfs4_put_stid(&deleg->dl_stid);
> > > -		if (cl->cl_minorversion)
> > > -			status = nfserr_deleg_revoked;
> > > +		status = nfserr_deleg_revoked;
> > >  		goto out;
> > >  	}
> > >  	flags = share_access_to_flags(open->op_share_access);
> > > @@ -6235,7 +6235,7 @@ nfs4_laundromat(struct nfsd_net *nn)
> > >  		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
> > >  		if (!state_expired(&lt, dp->dl_time))
> > >  			break;
> > > -		WARN_ON(!unhash_delegation_locked(dp));
> > > +		WARN_ON(!unhash_delegation_locked(dp, NFS4_REVOKED_DELEG_STID));
> > >  		list_add(&dp->dl_recall_lru, &reaplist);
> > >  	}
> > >  	spin_unlock(&state_lock);
> > > @@ -8350,7 +8350,7 @@ nfs4_state_shutdown_net(struct net *net)
> > >  	spin_lock(&state_lock);
> > >  	list_for_each_safe(pos, next, &nn->del_recall_lru) {
> > >  		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
> > > -		WARN_ON(!unhash_delegation_locked(dp));
> > > +		WARN_ON(!unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID));
> 
> Neil, what say we get rid of these WARN_ONs?
> 

I've added a patch with this intro:
Author: NeilBrown <neilb@suse.de>
Date:   Mon Nov 20 09:15:46 2023 +1100

    nfsd: don't call functions with side-effecting inside WARN_ON()
    
    Code like:
    
        WARN_ON(foo())
    
    looks like an assertion and might not be expected to have any side
    effects.
    When testing if a function with side-effects fails a construct like
    
        if (foo())
           WARN_ON(1);
    
    makes the intent more obvious.
    
    nfsd has several WARN_ON calls where the test has side effects, so it
    would be good to change them.  These cases don't really need the
    WARN_ON.  They have never failed in 8 years of usage so let's just
    remove the WARN_ON wrapper.
    
    Suggested-by: Chuck Lever <chuck.lever@oracle.com>
    Signed-off-by: NeilBrown <neilb@suse.de>

it removes 5 WARN_ONs from unhash_delegation_locked() calls.
They were added by
 Commit 3fcbbd244ed1 ("nfsd: ensure that delegation stateid hash references are only put once")
in 4.3

Thanks,
NeilBrown

> 
> > >  		list_add(&dp->dl_recall_lru, &reaplist);
> > >  	}
> > >  	spin_unlock(&state_lock);
> > 
> > Same question here. Should this go to stable? I guess the race is not
> > generally fatal...
> 
> Again, there's no existing bug report, so no urgency to get this
> backported. I would see these changes soak in upstream rather than
> pull them into stable quickly only to discover another bug has been
> introduced.
> 
> We don't have a failing test or a data corruption risk, as far as
> I can tell.
> 
> 
> > Reviewed-by: Jeff Layton <jlayton@kernel.org>
> 
> -- 
> Chuck Lever
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/9] nfsd: split sc_status out of sc_type
  2023-11-17 19:52   ` Chuck Lever
@ 2023-11-19 22:35     ` NeilBrown
  0 siblings, 0 replies; 31+ messages in thread
From: NeilBrown @ 2023-11-19 22:35 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Jeff Layton, linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Sat, 18 Nov 2023, Chuck Lever wrote:
> On Fri, Nov 17, 2023 at 01:18:49PM +1100, NeilBrown wrote:
> > sc_type identifies the type of a state - open, lock, deleg, layout - and
> > also the status of a state - closed or revoked.
> > 
> > This is a bit untidy and could get worse when "admin-revoked" states are
> > added.  So clean it up.
> > 
> > With this patch, the type is now all that is stored in sc_type.  This is
> > zero when the state is first added to ->cl_stateids (causing it to be
> > ignored), and is then set appropriately once it is fully initialised.
> > It is set under ->cl_lock to ensure atomicity w.r.t lookup.  It is now
> > never cleared.
> > 
> > sc_type is still a bit-set even though at most one bit is set.  This allows
> > lookup functions to be given a bitmap of acceptable types.
> > 
> > cl_type is now an unsigned short rather than char.  There is no value in
> 
> s/cl_type/sc_type
> 
> 
> > restricting to just 8 bits.  When passed to a function or stored in local
> > variable, "unsigned int" is used.
> 
> This seems confusing, and I'd prefer not to introduce implicit type
> conversions where they aren't truly necessary. Why not use "unsigned
> short" or even "unsigned int" everywhere?
> 

Not really a type conversion - just a size change.
Maybe I could use
   unsigned int sc_type:16, sc_stats:16;
and use unsigned int everywhere?

Maybe it is confusing...  maybe I'll just make it unsigned short
everywhere. 

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/9] nfsd: allow admin-revoked state to appear in /proc/fs/nfsd/clients/*/states
  2023-11-17 11:27   ` Jeff Layton
@ 2023-11-19 22:36     ` NeilBrown
  0 siblings, 0 replies; 31+ messages in thread
From: NeilBrown @ 2023-11-19 22:36 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Chuck Lever, linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Fri, 17 Nov 2023, Jeff Layton wrote:
> On Fri, 2023-11-17 at 13:18 +1100, NeilBrown wrote:
> > Change the "show" functions to show some content even if a file cannot
> > be found.
> > This is primarily useful for debugging - to ensure states are being
> > removed eventually.
> > 
> > Signed-off-by: NeilBrown <neilb@suse.de>
> > ---
> >  fs/nfsd/nfs4state.c | 81 +++++++++++++++++++++++----------------------
> >  1 file changed, 41 insertions(+), 40 deletions(-)
> > 
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index a2b3320a6ba8..8debd148840f 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -2680,17 +2680,10 @@ static int nfs4_show_open(struct seq_file *s, struct nfs4_stid *st)
> >  	struct nfs4_stateowner *oo;
> >  	unsigned int access, deny;
> >  
> > -	if (st->sc_type != NFS4_OPEN_STID && st->sc_type != NFS4_LOCK_STID)
> > -		return 0; /* XXX: or SEQ_SKIP? */
> >  	ols = openlockstateid(st);
> >  	oo = ols->st_stateowner;
> >  	nf = st->sc_file;
> >  
> > -	spin_lock(&nf->fi_lock);
> > -	file = find_any_file_locked(nf);
> > -	if (!file)
> > -		goto out;
> > -
> >  	seq_printf(s, "- ");
> >  	nfs4_show_stateid(s, &st->sc_stateid);
> >  	seq_printf(s, ": { type: open, ");
> > @@ -2705,14 +2698,19 @@ static int nfs4_show_open(struct seq_file *s, struct nfs4_stid *st)
> >  		deny & NFS4_SHARE_ACCESS_READ ? "r" : "-",
> >  		deny & NFS4_SHARE_ACCESS_WRITE ? "w" : "-");
> >  
> > -	nfs4_show_superblock(s, file);
> > -	seq_printf(s, ", ");
> > -	nfs4_show_fname(s, file);
> > -	seq_printf(s, ", ");
> > +	spin_lock(&nf->fi_lock);
> > +	file = find_any_file_locked(nf);
> > +	if (file) {
> > +		nfs4_show_superblock(s, file);
> > +		seq_puts(s, ", ");
> > +		nfs4_show_fname(s, file);
> > +		seq_puts(s, ", ");
> > +	}
> > +	spin_unlock(&nf->fi_lock);
> >  	nfs4_show_owner(s, oo);
> > +	if (st->sc_status & NFS4_STID_ADMIN_REVOKED)
> > +		seq_puts(s, ", admin-revoked");
> >  	seq_printf(s, " }\n");
> > -out:
> > -	spin_unlock(&nf->fi_lock);
> >  	return 0;
> >  }
> >  
> > @@ -2726,30 +2724,31 @@ static int nfs4_show_lock(struct seq_file *s, struct nfs4_stid *st)
> >  	ols = openlockstateid(st);
> >  	oo = ols->st_stateowner;
> >  	nf = st->sc_file;
> > -	spin_lock(&nf->fi_lock);
> > -	file = find_any_file_locked(nf);
> > -	if (!file)
> > -		goto out;
> >  
> >  	seq_printf(s, "- ");
> >  	nfs4_show_stateid(s, &st->sc_stateid);
> >  	seq_printf(s, ": { type: lock, ");
> >  
> > -	/*
> > -	 * Note: a lock stateid isn't really the same thing as a lock,
> > -	 * it's the locking state held by one owner on a file, and there
> > -	 * may be multiple (or no) lock ranges associated with it.
> > -	 * (Same for the matter is true of open stateids.)
> > -	 */
> > +	spin_lock(&nf->fi_lock);
> > +	file = find_any_file_locked(nf);
> > +	if (file) {
> > +		/*
> > +		 * Note: a lock stateid isn't really the same thing as a lock,
> > +		 * it's the locking state held by one owner on a file, and there
> > +		 * may be multiple (or no) lock ranges associated with it.
> > +		 * (Same for the matter is true of open stateids.)
> > +		 */
> >  
> > -	nfs4_show_superblock(s, file);
> > -	/* XXX: open stateid? */
> > -	seq_printf(s, ", ");
> > -	nfs4_show_fname(s, file);
> > -	seq_printf(s, ", ");
> > +		nfs4_show_superblock(s, file);
> > +		/* XXX: open stateid? */
> > +		seq_puts(s, ", ");
> > +		nfs4_show_fname(s, file);
> > +		seq_puts(s, ", ");
> > +	}
> >  	nfs4_show_owner(s, oo);
> > +	if (st->sc_status & NFS4_STID_ADMIN_REVOKED)
> > +		seq_puts(s, ", admin-revoked");
> >  	seq_printf(s, " }\n");
> > -out:
> >  	spin_unlock(&nf->fi_lock);
> >  	return 0;
> >  }
> > @@ -2762,27 +2761,29 @@ static int nfs4_show_deleg(struct seq_file *s, struct nfs4_stid *st)
> >  
> >  	ds = delegstateid(st);
> >  	nf = st->sc_file;
> > -	spin_lock(&nf->fi_lock);
> > -	file = nf->fi_deleg_file;
> > -	if (!file)
> > -		goto out;
> >  
> >  	seq_printf(s, "- ");
> >  	nfs4_show_stateid(s, &st->sc_stateid);
> >  	seq_printf(s, ": { type: deleg, ");
> >  
> >  	/* Kinda dead code as long as we only support read delegs: */
> 
> nit: You can probably remove the above comment since we now support
> write delegations.

Done.  Thanks for the suggestion.

NeilBrown


> 
> > -	seq_printf(s, "access: %s, ",
> > -		ds->dl_type == NFS4_OPEN_DELEGATE_READ ? "r" : "w");
> > +	seq_printf(s, "access: %s",
> > +		   ds->dl_type == NFS4_OPEN_DELEGATE_READ ? "r" : "w");
> >  
> >  	/* XXX: lease time, whether it's being recalled. */
> >  
> > -	nfs4_show_superblock(s, file);
> > -	seq_printf(s, ", ");
> > -	nfs4_show_fname(s, file);
> > -	seq_printf(s, " }\n");
> > -out:
> > +	spin_lock(&nf->fi_lock);
> > +	file = nf->fi_deleg_file;
> > +	if (file) {
> > +		seq_puts(s, ", ");
> > +		nfs4_show_superblock(s, file);
> > +		seq_puts(s, ", ");
> > +		nfs4_show_fname(s, file);
> > +	}
> >  	spin_unlock(&nf->fi_lock);
> > +	if (st->sc_status & NFS4_STID_ADMIN_REVOKED)
> > +		seq_puts(s, ", admin-revoked");
> > +	seq_puts(s, " }\n");
> >  	return 0;
> >  }
> >  
> 
> -- 
> Jeff Layton <jlayton@kernel.org>
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/9] nfsd: allow admin-revoked NFSv4.0 state to be freed.
  2023-11-17 11:34   ` Jeff Layton
@ 2023-11-19 22:40     ` NeilBrown
  0 siblings, 0 replies; 31+ messages in thread
From: NeilBrown @ 2023-11-19 22:40 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Chuck Lever, linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Fri, 17 Nov 2023, Jeff Layton wrote:
> On Fri, 2023-11-17 at 13:18 +1100, NeilBrown wrote:
> > For NFSv4.1 and later the client easily discovers if there is any
> > admin-revoked state and will then find and explicitly free it.
> > 
> > For NFSv4.0 there is no such mechanism.  The client can only find that
> > state is admin-revoked if it tries to use that state, and there is no
> > way for it to explicitly free the state.  So the server must hold on to
> > the stateid (at least) for an indefinite amount of time.  A
> > RELEASE_LOCKOWNER request might justify forgetting some of these
> > stateids, as would the whole clients lease lapsing, but these are not
> > reliable.
> > 
> > This patch takes two approaches.
> > 
> > Whenever a client uses an revoked stateid, that stateid is then
> > discarded and will not be recognised again.  This might confuse a client
> > which expect to get NFS4ERR_ADMIN_REVOKED consistently once it get it at
> > all, but should mostly work.  Hopefully one error will lead to other
> > resources being closed (e.g.  process exits), which will result in more
> > stateid being freed when a CLOSE attempt gets NFS4ERR_ADMIN_REVOKED.
> > 
> > Also, any admin-revoked stateids that have been that way for more than
> > one lease time are periodically revoke.
> > 
> 
> Why a single lease period?

Because it was easy to code.

> 
> Is that a long enough time for v4.0? Given that the protocol has no
> mechanism to detect revoked state, I have to wonder if a single lease
> period is enough time for the client to detect the problem?

There is no amount of time that is long enough.
In many cases the client will notice quickly.  In some cases it might
not notice for days or weeks.
Given that there is no perfect solution, and given that v4.0 is really
just a prototype that probably shouldn't be used any more, I don't think
there is any benefit is trying any harder.

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/9] nfsd: allow admin-revoked NFSv4.0 state to be freed.
  2023-11-17 19:58   ` Chuck Lever
@ 2023-11-19 22:47     ` NeilBrown
  2023-11-19 23:44       ` Chuck Lever
  0 siblings, 1 reply; 31+ messages in thread
From: NeilBrown @ 2023-11-19 22:47 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Jeff Layton, linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Sat, 18 Nov 2023, Chuck Lever wrote:
> On Fri, Nov 17, 2023 at 01:18:52PM +1100, NeilBrown wrote:
> > For NFSv4.1 and later the client easily discovers if there is any
> > admin-revoked state and will then find and explicitly free it.
> > 
> > For NFSv4.0 there is no such mechanism.  The client can only find that
> > state is admin-revoked if it tries to use that state, and there is no
> > way for it to explicitly free the state.  So the server must hold on to
> > the stateid (at least) for an indefinite amount of time.  A
> > RELEASE_LOCKOWNER request might justify forgetting some of these
> > stateids, as would the whole clients lease lapsing, but these are not
> > reliable.
> > 
> > This patch takes two approaches.
> > 
> > Whenever a client uses an revoked stateid, that stateid is then
> > discarded and will not be recognised again.  This might confuse a client
> > which expect to get NFS4ERR_ADMIN_REVOKED consistently once it get it at
> > all, but should mostly work.  Hopefully one error will lead to other
> > resources being closed (e.g.  process exits), which will result in more
> > stateid being freed when a CLOSE attempt gets NFS4ERR_ADMIN_REVOKED.
> > 
> > Also, any admin-revoked stateids that have been that way for more than
> > one lease time are periodically revoke.
> > 
> > No actual freeing of state happens in this patch.  That will come in
> > future patches which handle the different sorts of revoked state.
> > 
> > Signed-off-by: NeilBrown <neilb@suse.de>
> > ---
> >  fs/nfsd/netns.h     |  4 ++
> >  fs/nfsd/nfs4state.c | 97 ++++++++++++++++++++++++++++++++++++++++++++-
> >  2 files changed, 100 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> > index ec49b200b797..02f8fa095b0f 100644
> > --- a/fs/nfsd/netns.h
> > +++ b/fs/nfsd/netns.h
> > @@ -197,6 +197,10 @@ struct nfsd_net {
> >  	atomic_t		nfsd_courtesy_clients;
> >  	struct shrinker		nfsd_client_shrinker;
> >  	struct work_struct	nfsd_shrinker_work;
> > +
> > +	/* last time an admin-revoke happened for NFSv4.0 */
> > +	time64_t		nfs40_last_revoke;
> > +
> >  };
> 
> This hunk doesn't apply to nfsd-next due to v6.7-rc changes to NFSD
> to implement a dynamic shrinker. So I stopped my review here for
> now.

I didn't a rebase onto nfsd-next and there were no conflicts!

I guess the change from
  	struct work_struct	nfsd_shrinker_work;
to
  	struct work_struct	*nfsd_shrinker_work;

was technically a conflict but I'm surprised your tool complained..

NeilBrown

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/9 v3] support admin-revocation of v4 state
  2023-11-17 12:25 ` [PATCH 0/9 v3] support admin-revocation of v4 state Jeff Layton
@ 2023-11-19 23:21   ` NeilBrown
  0 siblings, 0 replies; 31+ messages in thread
From: NeilBrown @ 2023-11-19 23:21 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Chuck Lever, linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Fri, 17 Nov 2023, Jeff Layton wrote:
> On Fri, 2023-11-17 at 13:18 +1100, NeilBrown wrote:
> > This set adds a prequel series to my previous posting which addresses
> > some locking problems around ->sc_type and splits that field into
> > to separate fields: sc_type and sc_status.
> > The recovation code is modified to accomodate these changed.
> > 
> > Thanks
> > NeilBrown
> > 
> >  [PATCH 1/9] nfsd: hold ->cl_lock for hash_delegation_locked()
> >  [PATCH 2/9] nfsd: avoid race after unhash_delegation_locked()
> >  [PATCH 3/9] nfsd: split sc_status out of sc_type
> >  [PATCH 4/9] nfsd: prepare for supporting admin-revocation of state
> >  [PATCH 5/9] nfsd: allow admin-revoked state to appear in
> >  [PATCH 6/9] nfsd: allow admin-revoked NFSv4.0 state to be freed.
> >  [PATCH 7/9] nfsd: allow lock state ids to be revoked and then freed
> >  [PATCH 8/9] nfsd: allow open state ids to be revoked and then freed
> >  [PATCH 9/9] nfsd: allow delegation state ids to be revoked and then
> > 
> 
> Nice set. I like this overall. One (other) question: do we need to add
> handling for revoking layout stateids as well?

I guess so... I don't give much thought to layout stateids.
They are used in LAYOUTGET LAYOUTCOMMIT LAYOUTRETURN.
(it's seems odd that they aren't used in READ/WRITE....)

I guess we need to drop the ref on ->ls_file and maybe unlock the vfs
lease....

I wonder if I should just use ->sc_free for that.

Maybe ->sc_free could take a 'revoke_only' arg which causes it to free
resources but not free the stateid itself.
Maybe that would make the code cleaner.

NeilBrown


> 
> I'll plan to pull this into my local branch for some testing over the
> weekend.
> -- 
> Jeff Layton <jlayton@kernel.org>
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/9] nfsd: hold ->cl_lock for hash_delegation_locked()
  2023-11-19 21:37     ` NeilBrown
@ 2023-11-19 23:38       ` Chuck Lever
  2023-11-19 23:41         ` NeilBrown
  0 siblings, 1 reply; 31+ messages in thread
From: Chuck Lever @ 2023-11-19 23:38 UTC (permalink / raw)
  To: NeilBrown; +Cc: Jeff Layton, linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Mon, Nov 20, 2023 at 08:37:55AM +1100, NeilBrown wrote:
> On Fri, 17 Nov 2023, Jeff Layton wrote:
> > On Fri, 2023-11-17 at 13:18 +1100, NeilBrown wrote:
> > > The protocol for creating a new state in nfsd is to allocated the state
> > > leaving it largely uninitialised, add that state to the ->cl_stateids
> > > idr so as to reserve a state id, then complete initialisation of the
> > > state and only set ->sc_type to non-zero once the state is fully
> > > initialised.
> > > 
> > > If a state is found in the idr with ->sc_type == 0, it is ignored.
> > > The ->cl_lock list is used to avoid races - it is held while checking
> > > sc_type during lookup, and held when a non-zero value is stored in
> > > ->sc_type.
> > > 
> > > ... except... hash_delegation_locked() finalises the initialisation of a
> > > delegation state, but does NOT hold ->cl_lock.
> > > 
> > > So this patch takes ->cl_lock at the appropriate time w.r.t other locks,
> > > and so ensures there are no races (which are extremely unlikely in any
> > > case).
> > > 
> > > Signed-off-by: NeilBrown <neilb@suse.de>
> > > ---
> > >  fs/nfsd/nfs4state.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > > 
> > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > index 65fd5510323a..6368788a7d4e 100644
> > > --- a/fs/nfsd/nfs4state.c
> > > +++ b/fs/nfsd/nfs4state.c
> > > @@ -1317,6 +1317,7 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
> > >  
> > >  	lockdep_assert_held(&state_lock);
> > >  	lockdep_assert_held(&fp->fi_lock);
> > > +	lockdep_assert_held(&clp->cl_lock);
> > >  
> > >  	if (nfs4_delegation_exists(clp, fp))
> > >  		return -EAGAIN;
> > > @@ -5609,12 +5610,14 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
> > >  		goto out_unlock;
> > >  
> > >  	spin_lock(&state_lock);
> > > +	spin_lock(&clp->cl_lock);
> > >  	spin_lock(&fp->fi_lock);
> > >  	if (fp->fi_had_conflict)
> > >  		status = -EAGAIN;
> > >  	else
> > >  		status = hash_delegation_locked(dp, fp);
> > >  	spin_unlock(&fp->fi_lock);
> > > +	spin_unlock(&clp->cl_lock);
> > >  	spin_unlock(&state_lock);
> > >  
> > >  	if (status)
> > 
> > I know it's (supposedly) an unlikely race, but should we send this to
> > stable?
> 
> I don't know.  Once upon a time, "stable" meant something.  There was a
> clear list of rules.  Those seem to have been torn up.  Now it seems to
> be whatever some machine-learning tool chooses.
> If that tool chooses this patch (which I suspect it will), I won't
> object.  But I don't think it is worth encouraging it.

We've asked Sasha and GregKH not to use AUTOSEL on NFSD patches,
promising that we will explicitly mark anything that should be
back-ported.


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/9] nfsd: avoid race after unhash_delegation_locked()
  2023-11-19 22:23       ` NeilBrown
@ 2023-11-19 23:41         ` Chuck Lever
  0 siblings, 0 replies; 31+ messages in thread
From: Chuck Lever @ 2023-11-19 23:41 UTC (permalink / raw)
  To: NeilBrown; +Cc: Jeff Layton, linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Mon, Nov 20, 2023 at 09:23:43AM +1100, NeilBrown wrote:
> On Sat, 18 Nov 2023, Chuck Lever wrote:
> > On Fri, Nov 17, 2023 at 06:41:37AM -0500, Jeff Layton wrote:
> > > On Fri, 2023-11-17 at 13:18 +1100, NeilBrown wrote:
> > > > NFS4_CLOSED_DELEG_STID and NFS4_REVOKED_DELEG_STID are similar in
> > > > purpose.
> > > > REVOKED is used for NFSv4.1 states which have been revoked because the
> > > > lease has expired.  CLOSED is used in other cases.
> > > > The difference has two practical effects.
> > > > 1/ REVOKED states are on the ->cl_revoked list
> > > > 2/ REVOKED states result in nfserr_deleg_revoked from
> > > >    nfsd4_verify_open_stid() asnd nfsd4_validate_stateid while
> > > >    CLOSED states result in nfserr_bad_stid.
> > > > 
> > > > Currently a state that is being revoked is first set to "CLOSED" in
> > > > unhash_delegation_locked(), then possibly to "REVOKED" in
> > > > revoke_delegation(), at which point it is added to the cl_revoked list.
> > > > 
> > > > It is possible that a stateid test could see the CLOSED state
> > > > which really should be REVOKED, and so return the wrong error code.  So
> > > > it is safest to remove this window of inconsistency.
> > > > 
> > > > With this patch, unhash_delegation_locked() always set the state
> > > > correctly, and revoke_delegation() no longer changes the state.
> > > > 
> > > > Also remove a redundant test on minorversion when
> > > > NFS4_REVOKED_DELEG_STID is seen - it can only be seen when minorversion
> > > > is non-zero.
> > > > 
> > > > Signed-off-by: NeilBrown <neilb@suse.de>
> > > > ---
> > > >  fs/nfsd/nfs4state.c | 20 ++++++++++----------
> > > >  1 file changed, 10 insertions(+), 10 deletions(-)
> > > > 
> > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > > index 6368788a7d4e..7469583382fb 100644
> > > > --- a/fs/nfsd/nfs4state.c
> > > > +++ b/fs/nfsd/nfs4state.c
> > > > @@ -1334,7 +1334,7 @@ static bool delegation_hashed(struct nfs4_delegation *dp)
> > > >  }
> > > >  
> > > >  static bool
> > > > -unhash_delegation_locked(struct nfs4_delegation *dp)
> > > > +unhash_delegation_locked(struct nfs4_delegation *dp, unsigned char type)
> > 
> > unsigned short type ?
> 
> At this stage in the series 'type' is still an unsigned char.  I don't
> want to get ahead of myself.
> 
> > 
> > 
> > > >  {
> > > >  	struct nfs4_file *fp = dp->dl_stid.sc_file;
> > > >  
> > > > @@ -1343,7 +1343,9 @@ unhash_delegation_locked(struct nfs4_delegation *dp)
> > > >  	if (!delegation_hashed(dp))
> > > >  		return false;
> > > >  
> > > > -	dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID;
> > > > +	if (dp->dl_stid.sc_client->cl_minorversion == 0)
> > > > +		type = NFS4_CLOSED_DELEG_STID;
> > > > +	dp->dl_stid.sc_type = type;
> > > >  	/* Ensure that deleg break won't try to requeue it */
> > > >  	++dp->dl_time;
> > > >  	spin_lock(&fp->fi_lock);
> > > > @@ -1359,7 +1361,7 @@ static void destroy_delegation(struct nfs4_delegation *dp)
> > > >  	bool unhashed;
> > > >  
> > > >  	spin_lock(&state_lock);
> > > > -	unhashed = unhash_delegation_locked(dp);
> > > > +	unhashed = unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID);
> > > >  	spin_unlock(&state_lock);
> > > >  	if (unhashed)
> > > >  		destroy_unhashed_deleg(dp);
> > > > @@ -1373,9 +1375,8 @@ static void revoke_delegation(struct nfs4_delegation *dp)
> > > >  
> > > >  	trace_nfsd_stid_revoke(&dp->dl_stid);
> > > >  
> > > > -	if (clp->cl_minorversion) {
> > > > +	if (dp->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID) {
> > > >  		spin_lock(&clp->cl_lock);
> > > > -		dp->dl_stid.sc_type = NFS4_REVOKED_DELEG_STID;
> > > >  		refcount_inc(&dp->dl_stid.sc_count);
> > > >  		list_add(&dp->dl_recall_lru, &clp->cl_revoked);
> > > >  		spin_unlock(&clp->cl_lock);
> > > > @@ -2234,7 +2235,7 @@ __destroy_client(struct nfs4_client *clp)
> > > >  	spin_lock(&state_lock);
> > > >  	while (!list_empty(&clp->cl_delegations)) {
> > > >  		dp = list_entry(clp->cl_delegations.next, struct nfs4_delegation, dl_perclnt);
> > > > -		WARN_ON(!unhash_delegation_locked(dp));
> > > > +		WARN_ON(!unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID));
> > > >  		list_add(&dp->dl_recall_lru, &reaplist);
> > > >  	}
> > > >  	spin_unlock(&state_lock);
> > > > @@ -5197,8 +5198,7 @@ nfs4_check_deleg(struct nfs4_client *cl, struct nfsd4_open *open,
> > > >  		goto out;
> > > >  	if (deleg->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID) {
> > > >  		nfs4_put_stid(&deleg->dl_stid);
> > > > -		if (cl->cl_minorversion)
> > > > -			status = nfserr_deleg_revoked;
> > > > +		status = nfserr_deleg_revoked;
> > > >  		goto out;
> > > >  	}
> > > >  	flags = share_access_to_flags(open->op_share_access);
> > > > @@ -6235,7 +6235,7 @@ nfs4_laundromat(struct nfsd_net *nn)
> > > >  		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
> > > >  		if (!state_expired(&lt, dp->dl_time))
> > > >  			break;
> > > > -		WARN_ON(!unhash_delegation_locked(dp));
> > > > +		WARN_ON(!unhash_delegation_locked(dp, NFS4_REVOKED_DELEG_STID));
> > > >  		list_add(&dp->dl_recall_lru, &reaplist);
> > > >  	}
> > > >  	spin_unlock(&state_lock);
> > > > @@ -8350,7 +8350,7 @@ nfs4_state_shutdown_net(struct net *net)
> > > >  	spin_lock(&state_lock);
> > > >  	list_for_each_safe(pos, next, &nn->del_recall_lru) {
> > > >  		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
> > > > -		WARN_ON(!unhash_delegation_locked(dp));
> > > > +		WARN_ON(!unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID));
> > 
> > Neil, what say we get rid of these WARN_ONs?
> > 
> 
> I've added a patch with this intro:
> Author: NeilBrown <neilb@suse.de>
> Date:   Mon Nov 20 09:15:46 2023 +1100
> 
>     nfsd: don't call functions with side-effecting inside WARN_ON()
>     
>     Code like:
>     
>         WARN_ON(foo())
>     
>     looks like an assertion and might not be expected to have any side
>     effects.
>     When testing if a function with side-effects fails a construct like
>     
>         if (foo())
>            WARN_ON(1);
>     
>     makes the intent more obvious.
>     
>     nfsd has several WARN_ON calls where the test has side effects, so it
>     would be good to change them.  These cases don't really need the
>     WARN_ON.  They have never failed in 8 years of usage so let's just
>     remove the WARN_ON wrapper.
>     
>     Suggested-by: Chuck Lever <chuck.lever@oracle.com>
>     Signed-off-by: NeilBrown <neilb@suse.de>
> 
> it removes 5 WARN_ONs from unhash_delegation_locked() calls.
> They were added by
>  Commit 3fcbbd244ed1 ("nfsd: ensure that delegation stateid hash references are only put once")
> in 4.3

Very sensible, thank you!


> Thanks,
> NeilBrown
> 
> > 
> > > >  		list_add(&dp->dl_recall_lru, &reaplist);
> > > >  	}
> > > >  	spin_unlock(&state_lock);
> > > 
> > > Same question here. Should this go to stable? I guess the race is not
> > > generally fatal...
> > 
> > Again, there's no existing bug report, so no urgency to get this
> > backported. I would see these changes soak in upstream rather than
> > pull them into stable quickly only to discover another bug has been
> > introduced.
> > 
> > We don't have a failing test or a data corruption risk, as far as
> > I can tell.
> > 
> > 
> > > Reviewed-by: Jeff Layton <jlayton@kernel.org>
> > 
> > -- 
> > Chuck Lever
> > 
> 

-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/9] nfsd: hold ->cl_lock for hash_delegation_locked()
  2023-11-19 23:38       ` Chuck Lever
@ 2023-11-19 23:41         ` NeilBrown
  0 siblings, 0 replies; 31+ messages in thread
From: NeilBrown @ 2023-11-19 23:41 UTC (permalink / raw)
  To: Chuck Lever
  Cc: Jeff Layton, linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Mon, 20 Nov 2023, Chuck Lever wrote:
> On Mon, Nov 20, 2023 at 08:37:55AM +1100, NeilBrown wrote:
> > On Fri, 17 Nov 2023, Jeff Layton wrote:
> > > On Fri, 2023-11-17 at 13:18 +1100, NeilBrown wrote:
> > > > The protocol for creating a new state in nfsd is to allocated the state
> > > > leaving it largely uninitialised, add that state to the ->cl_stateids
> > > > idr so as to reserve a state id, then complete initialisation of the
> > > > state and only set ->sc_type to non-zero once the state is fully
> > > > initialised.
> > > > 
> > > > If a state is found in the idr with ->sc_type == 0, it is ignored.
> > > > The ->cl_lock list is used to avoid races - it is held while checking
> > > > sc_type during lookup, and held when a non-zero value is stored in
> > > > ->sc_type.
> > > > 
> > > > ... except... hash_delegation_locked() finalises the initialisation of a
> > > > delegation state, but does NOT hold ->cl_lock.
> > > > 
> > > > So this patch takes ->cl_lock at the appropriate time w.r.t other locks,
> > > > and so ensures there are no races (which are extremely unlikely in any
> > > > case).
> > > > 
> > > > Signed-off-by: NeilBrown <neilb@suse.de>
> > > > ---
> > > >  fs/nfsd/nfs4state.c | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > > 
> > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > > index 65fd5510323a..6368788a7d4e 100644
> > > > --- a/fs/nfsd/nfs4state.c
> > > > +++ b/fs/nfsd/nfs4state.c
> > > > @@ -1317,6 +1317,7 @@ hash_delegation_locked(struct nfs4_delegation *dp, struct nfs4_file *fp)
> > > >  
> > > >  	lockdep_assert_held(&state_lock);
> > > >  	lockdep_assert_held(&fp->fi_lock);
> > > > +	lockdep_assert_held(&clp->cl_lock);
> > > >  
> > > >  	if (nfs4_delegation_exists(clp, fp))
> > > >  		return -EAGAIN;
> > > > @@ -5609,12 +5610,14 @@ nfs4_set_delegation(struct nfsd4_open *open, struct nfs4_ol_stateid *stp,
> > > >  		goto out_unlock;
> > > >  
> > > >  	spin_lock(&state_lock);
> > > > +	spin_lock(&clp->cl_lock);
> > > >  	spin_lock(&fp->fi_lock);
> > > >  	if (fp->fi_had_conflict)
> > > >  		status = -EAGAIN;
> > > >  	else
> > > >  		status = hash_delegation_locked(dp, fp);
> > > >  	spin_unlock(&fp->fi_lock);
> > > > +	spin_unlock(&clp->cl_lock);
> > > >  	spin_unlock(&state_lock);
> > > >  
> > > >  	if (status)
> > > 
> > > I know it's (supposedly) an unlikely race, but should we send this to
> > > stable?
> > 
> > I don't know.  Once upon a time, "stable" meant something.  There was a
> > clear list of rules.  Those seem to have been torn up.  Now it seems to
> > be whatever some machine-learning tool chooses.
> > If that tool chooses this patch (which I suspect it will), I won't
> > object.  But I don't think it is worth encouraging it.
> 
> We've asked Sasha and GregKH not to use AUTOSEL on NFSD patches,
> promising that we will explicitly mark anything that should be
> back-ported.

Oh good - I didn't know that.
Thanks.

NeilBrown

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/9] nfsd: allow admin-revoked NFSv4.0 state to be freed.
  2023-11-19 22:47     ` NeilBrown
@ 2023-11-19 23:44       ` Chuck Lever
  0 siblings, 0 replies; 31+ messages in thread
From: Chuck Lever @ 2023-11-19 23:44 UTC (permalink / raw)
  To: NeilBrown; +Cc: Jeff Layton, linux-nfs, Olga Kornievskaia, Dai Ngo, Tom Talpey

On Mon, Nov 20, 2023 at 09:47:03AM +1100, NeilBrown wrote:
> On Sat, 18 Nov 2023, Chuck Lever wrote:
> > On Fri, Nov 17, 2023 at 01:18:52PM +1100, NeilBrown wrote:
> > > For NFSv4.1 and later the client easily discovers if there is any
> > > admin-revoked state and will then find and explicitly free it.
> > > 
> > > For NFSv4.0 there is no such mechanism.  The client can only find that
> > > state is admin-revoked if it tries to use that state, and there is no
> > > way for it to explicitly free the state.  So the server must hold on to
> > > the stateid (at least) for an indefinite amount of time.  A
> > > RELEASE_LOCKOWNER request might justify forgetting some of these
> > > stateids, as would the whole clients lease lapsing, but these are not
> > > reliable.
> > > 
> > > This patch takes two approaches.
> > > 
> > > Whenever a client uses an revoked stateid, that stateid is then
> > > discarded and will not be recognised again.  This might confuse a client
> > > which expect to get NFS4ERR_ADMIN_REVOKED consistently once it get it at
> > > all, but should mostly work.  Hopefully one error will lead to other
> > > resources being closed (e.g.  process exits), which will result in more
> > > stateid being freed when a CLOSE attempt gets NFS4ERR_ADMIN_REVOKED.
> > > 
> > > Also, any admin-revoked stateids that have been that way for more than
> > > one lease time are periodically revoke.
> > > 
> > > No actual freeing of state happens in this patch.  That will come in
> > > future patches which handle the different sorts of revoked state.
> > > 
> > > Signed-off-by: NeilBrown <neilb@suse.de>
> > > ---
> > >  fs/nfsd/netns.h     |  4 ++
> > >  fs/nfsd/nfs4state.c | 97 ++++++++++++++++++++++++++++++++++++++++++++-
> > >  2 files changed, 100 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> > > index ec49b200b797..02f8fa095b0f 100644
> > > --- a/fs/nfsd/netns.h
> > > +++ b/fs/nfsd/netns.h
> > > @@ -197,6 +197,10 @@ struct nfsd_net {
> > >  	atomic_t		nfsd_courtesy_clients;
> > >  	struct shrinker		nfsd_client_shrinker;
> > >  	struct work_struct	nfsd_shrinker_work;
> > > +
> > > +	/* last time an admin-revoke happened for NFSv4.0 */
> > > +	time64_t		nfs40_last_revoke;
> > > +
> > >  };
> > 
> > This hunk doesn't apply to nfsd-next due to v6.7-rc changes to NFSD
> > to implement a dynamic shrinker. So I stopped my review here for
> > now.
> 
> I didn't a rebase onto nfsd-next and there were no conflicts!
> 
> I guess the change from
>   	struct work_struct	nfsd_shrinker_work;
> to
>   	struct work_struct	*nfsd_shrinker_work;
> 
> was technically a conflict but I'm surprised your tool complained..

That was the change I noticed when it failed to apply. "stg import"
is pretty picky, unfortunately...


-- 
Chuck Lever

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2023-11-19 23:44 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-17  2:18 [PATCH 0/9 v3] support admin-revocation of v4 state NeilBrown
2023-11-17  2:18 ` [PATCH 1/9] nfsd: hold ->cl_lock for hash_delegation_locked() NeilBrown
2023-11-17 10:59   ` Jeff Layton
2023-11-19 21:37     ` NeilBrown
2023-11-19 23:38       ` Chuck Lever
2023-11-19 23:41         ` NeilBrown
2023-11-17 15:04   ` Chuck Lever
2023-11-19 22:07     ` NeilBrown
2023-11-17  2:18 ` [PATCH 2/9] nfsd: avoid race after unhash_delegation_locked() NeilBrown
2023-11-17 11:41   ` Jeff Layton
2023-11-17 19:43     ` Chuck Lever
2023-11-19 22:23       ` NeilBrown
2023-11-19 23:41         ` Chuck Lever
2023-11-17  2:18 ` [PATCH 3/9] nfsd: split sc_status out of sc_type NeilBrown
2023-11-17 19:52   ` Chuck Lever
2023-11-19 22:35     ` NeilBrown
2023-11-17  2:18 ` [PATCH 4/9] nfsd: prepare for supporting admin-revocation of state NeilBrown
2023-11-17  2:18 ` [PATCH 5/9] nfsd: allow admin-revoked state to appear in /proc/fs/nfsd/clients/*/states NeilBrown
2023-11-17 11:27   ` Jeff Layton
2023-11-19 22:36     ` NeilBrown
2023-11-17  2:18 ` [PATCH 6/9] nfsd: allow admin-revoked NFSv4.0 state to be freed NeilBrown
2023-11-17 11:34   ` Jeff Layton
2023-11-19 22:40     ` NeilBrown
2023-11-17 19:58   ` Chuck Lever
2023-11-19 22:47     ` NeilBrown
2023-11-19 23:44       ` Chuck Lever
2023-11-17  2:18 ` [PATCH 7/9] nfsd: allow lock state ids to be revoked and then freed NeilBrown
2023-11-17  2:18 ` [PATCH 8/9] nfsd: allow open " NeilBrown
2023-11-17  2:18 ` [PATCH 9/9] nfsd: allow delegation " NeilBrown
2023-11-17 12:25 ` [PATCH 0/9 v3] support admin-revocation of v4 state Jeff Layton
2023-11-19 23:21   ` NeilBrown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox