[PATCH AUTOSEL 6.18-5.10] NFSv4: ensure the open stateid seqid doesn't go backwards

linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Scott Mayhew <smayhew@redhat.com>,
	Benjamin Coddington <bcodding@hammerspace.com>,
	Trond Myklebust <trond.myklebust@hammerspace.com>,
	Sasha Levin <sashal@kernel.org>,
	trondmy@kernel.org, anna@kernel.org, linux-nfs@vger.kernel.org
Subject: [PATCH AUTOSEL 6.18-5.10] NFSv4: ensure the open stateid seqid doesn't go backwards
Date: Sun, 14 Dec 2025 19:41:22 -0500	[thread overview]
Message-ID: <20251215004145.2760442-5-sashal@kernel.org> (raw)
In-Reply-To: <20251215004145.2760442-1-sashal@kernel.org>

From: Scott Mayhew <smayhew@redhat.com>

[ Upstream commit 2e47c3cc64b44b0b06cd68c2801db92ff143f2b2 ]

We have observed an NFSv4 client receiving a LOCK reply with a status of
NFS4ERR_OLD_STATEID and subsequently retrying the LOCK request with an
earlier seqid value in the stateid.  As this was for a new lockowner,
that would imply that nfs_set_open_stateid_locked() had updated the open
stateid seqid with an earlier value.

Looking at nfs_set_open_stateid_locked(), if the incoming seqid is out
of sequence, the task will sleep on the state->waitq for up to 5
seconds.  If the task waits for the full 5 seconds, then after finishing
the wait it'll update the open stateid seqid with whatever value the
incoming seqid has.  If there are multiple waiters in this scenario,
then the last one to perform said update may not be the one with the
highest seqid.

Add a check to ensure that the seqid can only be incremented, and add a
tracepoint to indicate when old seqids are skipped.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Reviewed-by: Benjamin Coddington <bcodding@hammerspace.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---

LLM Generated explanations, may be completely bogus:

## Analysis Summary

### 3. Bug Classification and Severity

**This is clearly a BUG FIX**, not a feature addition:
- **Observed production issue**: NFS4ERR_OLD_STATEID errors
- **Race condition bug**: Multiple waiters can corrupt stateid ordering
- **Protocol compliance**: Client violates NFSv4 protocol by using stale
  seqids
- **Potential data integrity risk**: Lock failures can lead to data
  corruption in concurrent access scenarios

**The root cause explained:**
When `nfs_set_open_stateid_locked()` detects an out-of-sequence stateid,
it waits up to 5 seconds on `state->waitq`. If multiple waiters all
timeout (status = -EAGAIN), they all proceed to update the stateid. The
last one wins, but it may have an **older** seqid than an earlier
updater. This causes seqid to go backwards.

**The fix:**
Before allowing the update after a timeout, check if the incoming
stateid is actually newer than the current one using
`nfs4_stateid_is_newer()`. If not, skip the update entirely and return
early.

### 4. Scope and Risk Assessment

| Aspect | Assessment |
|--------|------------|
| Lines changed | ~15 lines in core logic |
| Files touched | 2 (nfs4proc.c, nfs4trace.h) |
| Dependencies | Uses existing helpers from 2019 |
| New APIs | None |
| Risk | LOW - adds a missing safety check |

The tracepoint addition (`nfs4_open_stateid_update_skip`) uses an
existing event class - it's purely diagnostic.

### 5. Stable Criteria Checklist

- ✅ **Obviously correct**: Simple check using well-established helper
  functions
- ✅ **Fixes real bug**: Observed NFS4ERR_OLD_STATEID in production
- ✅ **Important issue**: I/O failures, potential data integrity issues
- ✅ **Small and contained**: ~15 lines, single function fix
- ✅ **No new features**: Pure bug fix with diagnostic tracepoint
- ✅ **No new APIs**: Uses existing nfs4_stateid_* helpers from 2019
- ✅ **Reviewed**: By Benjamin Coddington (NFS expert)
- ✅ **Signed-off**: By Trond Myklebust (primary NFS maintainer)

### 6. User Impact

- **Who is affected**: Any NFSv4 user with concurrent file access
  patterns
- **Severity**: Medium-High - can cause I/O failures, lock errors, and
  potential data corruption
- **Enterprise impact**: NFS is heavily used in enterprise environments;
  this is an important fix

### 7. Dependencies Check

The helper functions used (`nfs4_stateid_match_other`,
`nfs4_stateid_is_newer`) have been in the kernel since 2019 and are used
by 24 and 11 callers respectively. They will exist in all active stable
trees (6.1.y, 6.6.y, etc.).

### Conclusion

This is an excellent stable backport candidate:
- Fixes a real, observed bug that causes protocol violations and I/O
  failures
- Small, surgical fix with clear logic
- Uses existing infrastructure - no dependency issues
- Reviewed and signed-off by NFS maintainers
- Low risk of regression - it adds a missing safety check

The fix prevents NFSv4 stateid seqid from going backwards, which is a
correctness issue that can impact production NFS users.

**YES**

 fs/nfs/nfs4proc.c  | 13 +++++++++++--
 fs/nfs/nfs4trace.h |  1 +
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 93c6ce04332b8..d1ad2ef2ee6f9 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1780,8 +1780,17 @@ static void nfs_set_open_stateid_locked(struct nfs4_state *state,
 		if (nfs_stateid_is_sequential(state, stateid))
 			break;

-		if (status)
-			break;
+		if (status) {
+			if (nfs4_stateid_match_other(stateid, &state->open_stateid) &&
+			    !nfs4_stateid_is_newer(stateid, &state->open_stateid)) {
+				trace_nfs4_open_stateid_update_skip(state->inode,
+								    stateid, status);
+				return;
+			} else {
+				break;
+			}
+		}
+
 		/* Rely on seqids for serialisation with NFSv4.0 */
 		if (!nfs4_has_session(NFS_SERVER(state->inode)->nfs_client))
 			break;
diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
index 9776d220cec33..6285128e631a5 100644
--- a/fs/nfs/nfs4trace.h
+++ b/fs/nfs/nfs4trace.h
@@ -1353,6 +1353,7 @@ DEFINE_NFS4_INODE_STATEID_EVENT(nfs4_setattr);
 DEFINE_NFS4_INODE_STATEID_EVENT(nfs4_delegreturn);
 DEFINE_NFS4_INODE_STATEID_EVENT(nfs4_open_stateid_update);
 DEFINE_NFS4_INODE_STATEID_EVENT(nfs4_open_stateid_update_wait);
+DEFINE_NFS4_INODE_STATEID_EVENT(nfs4_open_stateid_update_skip);
 DEFINE_NFS4_INODE_STATEID_EVENT(nfs4_close_stateid_update_wait);

 DECLARE_EVENT_CLASS(nfs4_getattr_event,
-- 
2.51.0

next      parent reply	other threads:[~2025-12-15  0:41 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20251215004145.2760442-1-sashal@kernel.org>
2025-12-15  0:41 ` Sasha Levin [this message]
2025-12-15  0:41 ` [PATCH AUTOSEL 6.18-5.10] NFS: Fix up the automount fs_context to use the correct cred Sasha Levin

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:93c6ce04332b dfblob:d1ad2ef2ee6f dfblob:9776d220cec3
dfblob:6285128e631a )
 OR (
bs:"[PATCH AUTOSEL 6.18-5.10] NFSv4: ensure the open stateid seqid doesn't go backwards" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251215004145.2760442-5-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=anna@kernel.org \
    --cc=bcodding@hammerspace.com \
    --cc=linux-nfs@vger.kernel.org \
    --cc=patches@lists.linux.dev \
    --cc=smayhew@redhat.com \
    --cc=stable@vger.kernel.org \
    --cc=trond.myklebust@hammerspace.com \
    --cc=trondmy@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).