* [PATCH v5 0/2] xfs: resolve close() deadlocks on frozen filesystems
@ 2026-06-16 5:38 Aditya Srivastava
2026-06-16 5:38 ` [PATCH v5 1/2] xfs: add a XFS_TRANS_WRITECOUNT_TRYLOCK flag Aditya Srivastava
2026-06-16 5:38 ` [PATCH v5 2/2] xfs: prevent close() from hanging on frozen filesystems Aditya Srivastava
0 siblings, 2 replies; 4+ messages in thread
From: Aditya Srivastava @ 2026-06-16 5:38 UTC (permalink / raw)
To: Carlos Maiolino, Christoph Hellwig
Cc: linux-xfs, linux-kernel, Aditya Prakash Srivastava
From: Aditya Prakash Srivastava <aditya.ansh182@gmail.com>
Hi Carlos and Christoph,
This is version 5 of the patch series addressing the close() system call
hanging indefinitely on frozen XFS filesystems (Bugzilla #205833).
Based on Christoph's feedback, I have made the following improvements:
- Added Christoph Hellwig's Reviewed-by tag to Patch 1.
- Wrapped the overly long xfs_trans_alloc line inside xfs_free_eofblocks()
to conform to Linux line length guidelines.
- Corrected the Patch 2 commit log phrasing to state that we are
adding the trans_flags parameter rather than renaming it compared
to upstream.
As requested, I have also submitted the corresponding regression test to
the xfstests mailing list (using tests/xfs/842 with a GPLv2-licensed
helper program). The fstests maintainer (Zorro Lang) reviewed the test
and has indicated that he will wait for this kernel patch to make it
through before merging the test suite additions.
THE REAL-WORLD IMPACT (BUGZILLA & DOWNSTREAM CASES)
==================================================
When a file that still has speculative post-EOF preallocated blocks is
closed on XFS, the release path synchronously attempts to free them via
xfs_free_eofblocks(). This allocates a write transaction
(xfs_trans_alloc), which blocks in sb_start_intwrite() on the superblock
freeze protection for as long as the filesystem stays frozen.
This behavior has a long history of causing severe system disruption:
- Downstream Red Hat Bugzilla 1474726 (dating back to 2017) details
complete system hangs during system backups when rsync and fsfreeze
are used. Even seemingly harmless read-only commands like
'cat /var/log/messages' would hang on close() in __sb_start_write
via xfs_free_eofblocks, requiring a hard reboot.
- Downstream LeApp integration test scenarios consistently hit this hang.
Hanging on close() frequently triggers container healthcheck failures,
systemd service timeouts, and cluster failover cascades, which is
disruptive to user-space applications that view close() as resource
reclamation. No other major Linux filesystem (ext4, btrfs, etc.)
synchronously allocates write transactions during close() system calls.
THE SOLUTION: NON-BLOCKING SUPERBLOCK TRYLOCK
=============================================
Instead of performing racy pre-checks, this series introduces
XFS_TRANS_WRITECOUNT_TRYLOCK. When specified, __xfs_trans_alloc()
attempts to obtain freeze protection using sb_start_intwrite_trylock().
If that fails, it aborts allocation gracefully and returns -EAGAIN.
Patch 2 then passes XFS_TRANS_WRITECOUNT_TRYLOCK from xfs_file_release()
into xfs_free_eofblocks(). If the transaction allocation fails because
the filesystem is frozen (-EAGAIN), XFS_EOFBLOCKS_RELEASED is left unset
on the inode, so a subsequent release or the background blockgc worker
can free the blocks once the filesystem is thawed.
REPRODUCER DETAILS (GPLV2 LICENSED)
===================================
As requested, the C reproducer below now carries a GPLv2-compatible
license. The same scenario is covered by the xfstests patch mentioned
above.
Compile with -pthread:
/*
* GPLv2-compatible XFS freeze close() hang reproducer.
* Copyright (c) 2026 Aditya Prakash Srivastava. All Rights Reserved.
*/
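#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/ioctl.h>
#include <sys/vfs.h>
#include <linux/fs.h>
#include <libgen.h>

static atomic_int close_started;
static atomic_int close_completed;

/* Close the fd from a helper thread so main() can observe a hang. */
static void *close_thread(void *arg)
{
	int fd = *(int *)arg;

	atomic_store(&close_started, 1);
	close(fd);
	atomic_store(&close_completed, 1);
	return NULL;
}

int main(int argc, char *argv[])
{
	static char buf[65536];
	struct statfs sfs;
	pthread_t thread;
	char *dir_buf, *parent_dir;
	int freeze_fd, write_fd;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <file-on-xfs>\n", argv[0]);
		return 1;
	}

	/* dirname() may modify its argument, so operate on a copy. */
	dir_buf = strdup(argv[1]);
	if (!dir_buf) {
		perror("strdup");
		return 1;
	}
	parent_dir = dirname(dir_buf);

	/* The file may not exist yet; fall back to its parent directory. */
	if (statfs(argv[1], &sfs) < 0 && statfs(parent_dir, &sfs) < 0) {
		perror("statfs");
		return 1;
	}
	if (sfs.f_type != 0x58465342) {	/* XFS_SUPER_MAGIC, "XFSB" */
		fprintf(stderr, "%s is not on an XFS filesystem\n", argv[1]);
		return 1;
	}

	freeze_fd = open(parent_dir, O_RDONLY);
	write_fd = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (freeze_fd < 0 || write_fd < 0) {
		perror("open");
		return 1;
	}

	/* 320 x 64 KiB = 20 MiB of buffered writes before the freeze. */
	for (int i = 0; i < 320; i++) {
		if (write(write_fd, buf, sizeof(buf)) < 0) {
			perror("write");
			return 1;
		}
	}

	if (ioctl(freeze_fd, FIFREEZE, 0) < 0) {
		perror("FIFREEZE");
		return 1;
	}
	if (pthread_create(&thread, NULL, close_thread, &write_fd) != 0) {
		fprintf(stderr, "pthread_create failed\n");
		ioctl(freeze_fd, FITHAW, 0);
		return 1;
	}
	while (!atomic_load(&close_started))
		usleep(1000);
	usleep(1000000);	/* give close() one second to return */
	if (!atomic_load(&close_completed))
		printf("SUCCESS: close() hung!\n");

	ioctl(freeze_fd, FITHAW, 0);
	pthread_join(thread, NULL);
	unlink(argv[1]);
	free(dir_buf);
	return 0;
}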
Link: https://bugzilla.kernel.org/show_bug.cgi?id=205833
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1474726
Aditya Prakash Srivastava (2):
xfs: add a XFS_TRANS_WRITECOUNT_TRYLOCK flag
xfs: prevent close() from hanging on frozen filesystems
fs/xfs/libxfs/xfs_shared.h | 3 +++
fs/xfs/xfs_bmap_util.c | 10 ++++++----
fs/xfs/xfs_bmap_util.h | 2 +-
fs/xfs/xfs_file.c | 8 +++++---
fs/xfs/xfs_icache.c | 2 +-
fs/xfs/xfs_inode.c | 2 +-
fs/xfs/xfs_trans.c | 12 +++++++++++-
7 files changed, 28 insertions(+), 11 deletions(-)
--
2.47.3
* [PATCH v5 1/2] xfs: add a XFS_TRANS_WRITECOUNT_TRYLOCK flag
2026-06-16 5:38 [PATCH v5 0/2] xfs: resolve close() deadlocks on frozen filesystems Aditya Srivastava
@ 2026-06-16 5:38 ` Aditya Srivastava
2026-06-16 5:38 ` [PATCH v5 2/2] xfs: prevent close() from hanging on frozen filesystems Aditya Srivastava
1 sibling, 0 replies; 4+ messages in thread
From: Aditya Srivastava @ 2026-06-16 5:38 UTC (permalink / raw)
To: Carlos Maiolino, Christoph Hellwig
Cc: linux-xfs, linux-kernel, Aditya Prakash Srivastava
From: Aditya Prakash Srivastava <aditya.ansh182@gmail.com>
Introduce a new transaction allocation flag, XFS_TRANS_WRITECOUNT_TRYLOCK.
When this flag is specified, __xfs_trans_alloc() attempts to obtain
freeze protection using sb_start_intwrite_trylock() instead of blocking
indefinitely on sb_start_intwrite().
If the trylock fails, the allocation is aborted gracefully: the freshly
allocated transaction handle is freed, and the function returns the
appropriate error pointer ERR_PTR(-EAGAIN), which is then propagated
to the caller by xfs_trans_alloc().
Also add an assertion in __xfs_trans_alloc() to ensure that both
XFS_TRANS_NO_WRITECOUNT and XFS_TRANS_WRITECOUNT_TRYLOCK are never
specified at the same time, as they are mutually exclusive.
Suggested-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Aditya Prakash Srivastava <aditya.ansh182@gmail.com>
---
fs/xfs/libxfs/xfs_shared.h | 3 +++
fs/xfs/xfs_trans.c | 12 +++++++++++-
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/libxfs/xfs_shared.h b/fs/xfs/libxfs/xfs_shared.h
index b1e0d9bc1f7d..68d22b6cddd3 100644
--- a/fs/xfs/libxfs/xfs_shared.h
+++ b/fs/xfs/libxfs/xfs_shared.h
@@ -164,6 +164,9 @@ void xfs_log_get_max_trans_res(struct xfs_mount *mp,
/* Transaction has locked the rtbitmap and rtsum inodes */
#define XFS_TRANS_RTBITMAP_LOCKED (1u << 9)
+/* Try lock filesystem superblock for freeze protection */
+#define XFS_TRANS_WRITECOUNT_TRYLOCK (1u << 10)
+
/*
* Field values for xfs_trans_mod_sb.
*/
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 148cc32449c1..3860e44d6439 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -216,10 +216,18 @@ __xfs_trans_alloc(
struct xfs_trans *tp;
ASSERT(!(flags & XFS_TRANS_RES_FDBLKS) || xfs_has_lazysbcount(mp));
+ ASSERT(!((flags & XFS_TRANS_NO_WRITECOUNT) &&
+ (flags & XFS_TRANS_WRITECOUNT_TRYLOCK)));
tp = kmem_cache_zalloc(xfs_trans_cache, GFP_KERNEL | __GFP_NOFAIL);
- if (!(flags & XFS_TRANS_NO_WRITECOUNT))
+ if (flags & XFS_TRANS_WRITECOUNT_TRYLOCK) {
+ if (!sb_start_intwrite_trylock(mp->m_super)) {
+ kmem_cache_free(xfs_trans_cache, tp);
+ return ERR_PTR(-EAGAIN);
+ }
+ } else if (!(flags & XFS_TRANS_NO_WRITECOUNT)) {
sb_start_intwrite(mp->m_super);
+ }
xfs_trans_set_context(tp);
tp->t_flags = flags;
tp->t_mountp = mp;
@@ -252,6 +260,8 @@ xfs_trans_alloc(
*/
retry:
tp = __xfs_trans_alloc(mp, flags);
+ if (IS_ERR(tp))
+ return PTR_ERR(tp);
WARN_ON(mp->m_super->s_writers.frozen == SB_FREEZE_COMPLETE);
error = xfs_trans_reserve(tp, resp, blocks, rtextents);
if (error == -ENOSPC && want_retry) {
--
2.47.3
* [PATCH v5 2/2] xfs: prevent close() from hanging on frozen filesystems
2026-06-16 5:38 [PATCH v5 0/2] xfs: resolve close() deadlocks on frozen filesystems Aditya Srivastava
2026-06-16 5:38 ` [PATCH v5 1/2] xfs: add a XFS_TRANS_WRITECOUNT_TRYLOCK flag Aditya Srivastava
@ 2026-06-16 5:38 ` Aditya Srivastava
2026-06-16 13:04 ` Christoph Hellwig
1 sibling, 1 reply; 4+ messages in thread
From: Aditya Srivastava @ 2026-06-16 5:38 UTC (permalink / raw)
To: Carlos Maiolino, Christoph Hellwig
Cc: linux-xfs, linux-kernel, Aditya Prakash Srivastava
From: Aditya Prakash Srivastava <aditya.ansh182@gmail.com>
When a file with active speculative post-EOF preallocations is closed,
xfs_file_release() synchronously triggers xfs_free_eofblocks() to clean
them up. This requires allocating a write transaction (xfs_trans_alloc),
which blocks indefinitely if the filesystem is currently frozen or in the
process of freezing, as it waits to acquire the superblock's write lock.
As a result, a close() system call on a read-write file descriptor can
hang indefinitely in percpu_rwsem_wait() until the filesystem is thawed,
even if the file is closed by a non-writer process or after all writing
activity has already ceased.
To fix this properly and avoid any potential race conditions where a freeze
might come in immediately after a writable check, pass the new
XFS_TRANS_WRITECOUNT_TRYLOCK flag to xfs_trans_alloc() when freeing
speculative preallocations in xfs_file_release().
If xfs_free_eofblocks() returns -EAGAIN on a trylock failure, we cleanly
bypass setting XFS_EOFBLOCKS_RELEASED on the inode, ensuring subsequent
releases or the background blockgc garbage collector can successfully retry
the cleanup once the filesystem thaws.
Also, add the new trans_flags parameter to xfs_free_eofblocks() to make
its usage stand out, and update existing callers to pass 0 to preserve
standard blocking paths.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=205833
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1474726
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Aditya Prakash Srivastava <aditya.ansh182@gmail.com>
---
fs/xfs/xfs_bmap_util.c | 10 ++++++----
fs/xfs/xfs_bmap_util.h | 2 +-
fs/xfs/xfs_file.c | 8 +++++---
fs/xfs/xfs_icache.c | 2 +-
fs/xfs/xfs_inode.c | 2 +-
5 files changed, 14 insertions(+), 10 deletions(-)
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 0ab00615f1ad..a99aae4a1631 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -574,7 +574,8 @@ xfs_can_free_eofblocks(
*/
int
xfs_free_eofblocks(
- struct xfs_inode *ip)
+ struct xfs_inode *ip,
+ uint trans_flags)
{
struct xfs_trans *tp;
struct xfs_mount *mp = ip->i_mount;
@@ -604,9 +605,10 @@ xfs_free_eofblocks(
return 0;
}
- error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0, 0, &tp);
+ error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0,
+ trans_flags, &tp);
if (error) {
- ASSERT(xfs_is_shutdown(mp));
+ ASSERT(error == -EAGAIN || xfs_is_shutdown(mp));
return error;
}
@@ -928,7 +930,7 @@ xfs_prepare_shift(
* into the accessible region of the file.
*/
if (xfs_can_free_eofblocks(ip)) {
- error = xfs_free_eofblocks(ip);
+ error = xfs_free_eofblocks(ip, 0);
if (error)
return error;
}
diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h
index c477b3361630..c13774aa0892 100644
--- a/fs/xfs/xfs_bmap_util.h
+++ b/fs/xfs/xfs_bmap_util.h
@@ -66,7 +66,7 @@ int xfs_insert_file_space(struct xfs_inode *, xfs_off_t offset,
/* EOF block manipulation functions */
bool xfs_can_free_eofblocks(struct xfs_inode *ip);
-int xfs_free_eofblocks(struct xfs_inode *ip);
+int xfs_free_eofblocks(struct xfs_inode *ip, uint trans_flags);
int xfs_swap_extents(struct xfs_inode *ip, struct xfs_inode *tip,
struct xfs_swapext *sx);
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 845a97c9b063..76c9b2fe7c51 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1806,9 +1806,11 @@ xfs_file_release(
*/
if (!xfs_iflags_test(ip, XFS_EOFBLOCKS_RELEASED) &&
xfs_ilock_nowait(ip, XFS_IOLOCK_EXCL)) {
- if (xfs_can_free_eofblocks(ip) &&
- !xfs_iflags_test_and_set(ip, XFS_EOFBLOCKS_RELEASED))
- xfs_free_eofblocks(ip);
+ if (!xfs_iflags_test(ip, XFS_EOFBLOCKS_RELEASED) &&
+ xfs_can_free_eofblocks(ip) &&
+ !xfs_free_eofblocks(ip, XFS_TRANS_WRITECOUNT_TRYLOCK))
+ xfs_iflags_set(ip, XFS_EOFBLOCKS_RELEASED);
+
xfs_iunlock(ip, XFS_IOLOCK_EXCL);
}
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 2040a9292ee6..c575b4acb24c 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1259,7 +1259,7 @@ xfs_inode_free_eofblocks(
*lockflags |= XFS_IOLOCK_EXCL;
if (xfs_can_free_eofblocks(ip))
- return xfs_free_eofblocks(ip);
+ return xfs_free_eofblocks(ip, 0);
/* inode could be preallocated */
trace_xfs_inode_free_eofblocks_invalid(ip);
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index ddf2707c8894..14d3cd04a79f 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1423,7 +1423,7 @@ xfs_inactive(
* reference to the inode at this point anyways.
*/
if (xfs_can_free_eofblocks(ip))
- error = xfs_free_eofblocks(ip);
+ error = xfs_free_eofblocks(ip, 0);
goto out;
}
--
2.47.3
* Re: [PATCH v5 2/2] xfs: prevent close() from hanging on frozen filesystems
2026-06-16 5:38 ` [PATCH v5 2/2] xfs: prevent close() from hanging on frozen filesystems Aditya Srivastava
@ 2026-06-16 13:04 ` Christoph Hellwig
0 siblings, 0 replies; 4+ messages in thread
From: Christoph Hellwig @ 2026-06-16 13:04 UTC (permalink / raw)
To: Aditya Srivastava
Cc: Carlos Maiolino, Christoph Hellwig, linux-xfs, linux-kernel
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>