linux-fsdevel.vger.kernel.org archive mirror
* [Patch v4 0/3] NFSD: Fix server hang when there are multiple layout conflicts
@ 2025-11-15 19:16 Dai Ngo
  2025-11-15 19:16 ` [PATCH v4 1/3] locks: Introduce lm_breaker_timedout operation to lease_manager_operations Dai Ngo
                   ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: Dai Ngo @ 2025-11-15 19:16 UTC
  To: chuck.lever, jlayton, neilb, okorniev, tom, hch, alex.aring, viro,
	brauner, jack
  Cc: linux-fsdevel, linux-nfs

NFSD: Fix server hang when there are multiple layout conflicts

When a layout conflict triggers a call to __break_lease, the function
nfsd4_layout_lm_break clears the fl_break_time timeout before sending
the CB_LAYOUTRECALL. As a result, __break_lease repeatedly restarts
its loop, waiting indefinitely for the conflicting file lease to be
released.
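
The behaviour described above can be illustrated with a small user-space
model. This is only a sketch under stated assumptions, not the kernel code:
struct model_lease, model_lm_break_clearing, model_lease_expired and
model_break_lease are hypothetical names, time is counted in abstract ticks,
and a deadline of 0 is assumed to mean "never time out".

```c
#include <stdbool.h>

/* Hypothetical user-space model of a lease; not the kernel's struct. */
struct model_lease {
	unsigned long break_time; /* absolute deadline in ticks; 0 = no deadline */
	bool released;            /* set once the holder gives the lease back */
};

/* Models a break callback that clears the deadline before sending a recall. */
static void model_lm_break_clearing(struct model_lease *l)
{
	l->break_time = 0;
}

/* A lease only times out if it has a nonzero deadline that has passed. */
static bool model_lease_expired(const struct model_lease *l, unsigned long now)
{
	return l->break_time != 0 && now >= l->break_time;
}

/*
 * Models the breaker's wait loop. Returns the tick at which the loop
 * finished, or max_ticks if it was still waiting when the model gave up.
 */
static unsigned long model_break_lease(struct model_lease *l,
				       unsigned long timeout,
				       unsigned long max_ticks,
				       bool clear_deadline)
{
	unsigned long now = 0;

	l->break_time = now + timeout;
	if (clear_deadline)
		model_lm_break_clearing(l);

	while (now < max_ticks) {
		if (l->released || model_lease_expired(l, now))
			return now;
		now++; /* one wait interval elapses, then the loop restarts */
	}
	return max_ticks;
}
```

In this model, a breaker whose deadline is left intact stops waiting once the
timeout passes, while a breaker whose deadline was cleared keeps looping for
as long as the holder does not release the lease.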

If the number of lease conflicts matches the number of NFSD threads (which
defaults to 8), all available NFSD threads become occupied. Consequently,
there are no threads left to handle incoming requests or callback replies,
leading to a total hang of the NFSD server.

This issue is reliably reproducible by running the Git test suite on a
configuration using the SCSI layout.

This patchset fixes the problem by introducing a new lm_breaker_timedout
operation in lease_manager_operations and enforcing a timeout for layout
lease breaks.
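
As a rough illustration of the idea, the sketch below models a bounded wait
that invokes a per-lease-manager hook once the deadline passes. It is a
user-space approximation only: struct model_layout_lease,
struct model_lease_ops, model_fence_client and model_break_lease_bounded are
hypothetical names, and the signature of the real lm_breaker_timedout
operation is not shown in this cover letter and may differ.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_layout_lease;

/* Hypothetical operations table; the real lease_manager_operations differs. */
struct model_lease_ops {
	void (*breaker_timedout)(struct model_layout_lease *l);
};

struct model_layout_lease {
	unsigned long deadline;            /* absolute deadline in ticks */
	bool released;                     /* lease no longer blocks the breaker */
	bool fenced;                       /* client was fenced by the hook */
	const struct model_lease_ops *ops; /* optional lease-manager hooks */
};

/* Assumed hook behaviour: fence the client, then treat the lease as gone. */
static void model_fence_client(struct model_layout_lease *l)
{
	l->fenced = true;
	l->released = true;
}

static const struct model_lease_ops model_layout_ops = {
	.breaker_timedout = model_fence_client,
};

/*
 * Bounded wait: once the deadline passes, call the hook if one is set,
 * otherwise just drop the lease. Returns the tick at which waiting ended.
 */
static unsigned long model_break_lease_bounded(struct model_layout_lease *l,
					       unsigned long timeout,
					       unsigned long max_ticks)
{
	unsigned long now = 0;

	l->deadline = now + timeout;
	while (now < max_ticks && !l->released) {
		if (now >= l->deadline) {
			if (l->ops && l->ops->breaker_timedout)
				l->ops->breaker_timedout(l);
			else
				l->released = true;
			break;
		}
		now++; /* one wait interval elapses */
	}
	return now;
}
```

The point of the model is that the breaker's wait is always bounded by the
deadline, and what happens at the deadline is delegated to the lease manager
through the hook.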

V2:
. replace int with u32 for ls_layout_type in nfsd_layout_breaker_timedout.

. add a mechanism to ensure that threads waiting in __break_lease on a layout
  conflict do not continue until one of the waiting threads has completed the
  fencing operation.

V3:
. break the patchset into 3 patches:
  (1/3) locks: Introduce lm_breaker_timedout operation to lease_manager_operations.
        No change from V1
  (2/3) locks: Threads with layout conflict must wait until the client was fenced.
        New patch: add synchronization in __break_lease so that threads with
        a layout conflict wait until the client has been fenced.
  (3/3) NFSD: Fix NFS server hang when there are multiple layout conflicts
        Replace int with u32 for ls_layout_type in nfsd_layout_breaker_timedout
V4:
. (2/3) add wait_queue_head_t in file_lock_context for threads to wait
        until fencing is done.

 Documentation/filesystems/locking.rst |  2 ++
 fs/locks.c                            | 38 +++++++++++++++++++++++++++---
 fs/nfsd/nfs4layouts.c                 | 26 ++++++++++++++++----
 include/linux/filelock.h              |  7 ++++++
 4 files changed, 66 insertions(+), 7 deletions(-)


end of thread, other threads:[~2025-11-20  6:59 UTC | newest]

Thread overview: 37+ messages
2025-11-15 19:16 [Patch v4 0/3] NFSD: Fix server hang when there are multiple layout conflicts Dai Ngo
2025-11-15 19:16 ` [PATCH v4 1/3] locks: Introduce lm_breaker_timedout operation to lease_manager_operations Dai Ngo
2025-11-17 15:39   ` Chuck Lever
2025-11-17 19:17     ` Dai Ngo
2025-11-17 18:02   ` Jeff Layton
2025-11-17 19:41     ` Dai Ngo
2025-11-19 13:52       ` Jeff Layton
2025-11-19 16:32         ` Dai Ngo
2025-11-19  9:54   ` Christoph Hellwig
2025-11-19 14:04     ` Chuck Lever
2025-11-15 19:16 ` [PATCH v4 2/3] locks: Threads with layout conflict must wait until client was fenced Dai Ngo
2025-11-17 15:47   ` Chuck Lever
2025-11-17 19:21     ` Dai Ngo
2025-11-17 18:21   ` Jeff Layton
2025-11-17 19:49     ` Dai Ngo
2025-11-19  9:53     ` Christoph Hellwig
2025-11-15 19:16 ` [PATCH v4 3/3] NFSD: Fix NFS server hang when there are multiple layout conflicts Dai Ngo
2025-11-15 19:44   ` Chuck Lever
2025-11-15 20:20     ` Dai Ngo
2025-11-19  9:56       ` Christoph Hellwig
2025-11-19 16:35         ` Dai Ngo
2025-11-17 15:55   ` Chuck Lever
2025-11-17 19:40     ` Dai Ngo
2025-11-17 21:13       ` Benjamin Coddington
2025-11-17 22:00         ` Dai Ngo
2025-11-19 10:08           ` Christoph Hellwig
2025-11-19 16:52             ` Dai Ngo
2025-11-20  6:50               ` Christoph Hellwig
2025-11-19 10:05       ` Christoph Hellwig
2025-11-19 14:04         ` Benjamin Coddington
2025-11-19 14:09         ` Chuck Lever
2025-11-19 14:12           ` Jeff Layton
2025-11-19 17:06           ` Dai Ngo
2025-11-20  6:52             ` Christoph Hellwig
2025-11-20  6:59           ` Christoph Hellwig
2025-11-19  9:57     ` Christoph Hellwig
2025-11-19 10:08   ` Christoph Hellwig
