* [PATCH] xfs: prevent close() from hanging on frozen filesystems
@ 2026-06-10 13:13 Aditya Srivastava
2026-06-10 13:28 ` Christoph Hellwig
0 siblings, 1 reply; 2+ messages in thread
From: Aditya Srivastava @ 2026-06-10 13:13 UTC
To: Carlos Maiolino; +Cc: linux-xfs, linux-kernel, Aditya Prakash Srivastava
From: Aditya Prakash Srivastava <aditya.ansh182@gmail.com>
When a file with active speculative post-EOF preallocations is closed,
xfs_file_release() synchronously triggers xfs_free_eofblocks() to clean
them up. This requires allocating a write transaction (xfs_trans_alloc),
which blocks indefinitely if the filesystem is currently frozen or in the
process of freezing, as it waits in sb_start_intwrite() for the
superblock's freeze protection.
As a result, a close() system call on a read-write file descriptor can
hang indefinitely in percpu_rwsem_wait() until the filesystem is thawed,
even if the file is closed by a non-writer process or after all writing
activity has already ceased.
This issue has been seen across multiple downstream environments and has a
long history of causing severe system disruption. For example:
- Downstream Red Hat Bugzilla 1474726 (dating back to 2017) details
  complete system hangs during system backups when rsync and fsfreeze
  are used. Even seemingly harmless read-only commands like
  'cat /var/log/messages' would hang on close() in __sb_start_write
  via xfs_free_eofblocks, requiring a hard reboot.
- Downstream LeApp integration test scenarios (e.g. systemd-rsync migration
  checks) consistently hit this hang when trying to freeze the system.
Historically, XFS maintainers dismissed this behavior as NOTABUG, claiming
that close() is not a read-only operation and is expected to block since it
allocates write transactions. However, this behavior is highly disruptive.
User-space applications view close() as a resource reclamation system call,
not a write operation, and do not expect it to block. Hanging on close()
frequently triggers container healthcheck failures, systemd service
timeouts, and cluster failover cascades.
Additionally, no other major Linux filesystem (such as ext4 or btrfs)
synchronously allocates write transactions during close(), so this hang
is unexpected behavior specific to XFS.
We can safely skip this post-EOF cleanup optimization during a filesystem
freeze because:
1. Speculative preallocation is purely a performance heuristic to prevent
   fragmentation, not a requirement for file correctness or metadata
   consistency. The frozen snapshot remains completely consistent and safe,
   regardless of whether these post-EOF blocks are freed before or
   after thaw.
2. No space is permanently leaked. Any skipped speculative preallocations
   are safely preserved and will be scanned and reclaimed automatically by
   the background block garbage collection (blockgc) workers once the
   filesystem is thawed.
3. Precedent already exists in xfs_file_release() to skip this truncation:
   it already uses xfs_ilock_nowait() and silently skips the cleanup if
   the lock cannot be acquired, relying on background or future cleanup to
   avoid mmap_lock deadlocks. Skipping under fsfreeze is consistent with
   this existing design.
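The trylock-and-skip precedent from point 3 can be modeled in userspace
roughly as follows. This is only an analogy built on pthread mutexes, not
kernel code: struct demo_inode and demo_release_nowait() are hypothetical
names invented for this sketch and do not exist in XFS.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode that holds post-EOF blocks. */
struct demo_inode {
	pthread_mutex_t iolock;		/* models the inode I/O lock */
	int post_eof_blocks;		/* models speculative preallocation */
};

/*
 * Model of "try the lock, skip the cleanup if it is contended".
 * Returns true if the cleanup ran, false if it was skipped and left
 * for a later pass.
 */
bool demo_release_nowait(struct demo_inode *ip)
{
	assert(ip != NULL);

	/* Never wait for the lock: contention means "skip for now". */
	if (pthread_mutex_trylock(&ip->iolock) != 0)
		return false;

	ip->post_eof_blocks = 0;	/* the actual cleanup */
	pthread_mutex_unlock(&ip->iolock);
	return true;
}
```

In this model the caller never sleeps on the lock. A skipped cleanup just
leaves post_eof_blocks untouched for whoever runs next, which is the same
trade-off the patch proposes for the frozen case.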
Note that background blockgc and inodegc workers are already explicitly
stopped during freeze (via xfs_blockgc_stop() and xfs_inodegc_stop()),
leaving the synchronous xfs_file_release() path as the sole remaining
unblocked path that could attempt write transactions on a frozen
filesystem.
Fix this hang by checking if the filesystem is writable at the
SB_FREEZE_WRITE level in xfs_file_release() and returning early if it
is frozen or freezing.
A simple C reproducer demonstrating the hang (compile with -pthread):
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <pthread.h>
#include <sys/ioctl.h>
#include <sys/vfs.h>
#include <linux/fs.h>
#include <libgen.h>
volatile int close_started = 0;
volatile int close_completed = 0;

void *close_thread(void *arg) {
    int fd = *(int *)arg;

    close_started = 1;
    close(fd);  /* expected to block here while the filesystem is frozen */
    close_completed = 1;
    return NULL;
}

int main(int argc, char *argv[]) {
    struct statfs sfs;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <file-on-xfs>\n", argv[0]);
        return 1;
    }
    /* Only run on XFS (superblock magic 0x58465342, "XFSB"). */
    if (statfs(argv[1], &sfs) != 0 || sfs.f_type != 0x58465342) return 1;

    int freeze_fd = open(dirname(strdup(argv[1])), O_RDONLY);
    int write_fd = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (freeze_fd < 0 || write_fd < 0) return 1;
    char buf[65536] = {0};

    /* Write 20 MiB, intended to leave speculative post-EOF blocks behind. */
    for (int i = 0; i < 320; i++) write(write_fd, buf, sizeof(buf));
    ioctl(freeze_fd, FIFREEZE, 0);

    pthread_t thread;
    pthread_create(&thread, NULL, close_thread, &write_fd);
    while (!close_started) usleep(1000);
    usleep(1000000); /* give close() 1s to return */
    if (!close_completed) printf("SUCCESS: close() hung!\n");

    ioctl(freeze_fd, FITHAW, 0);
    pthread_join(thread, NULL);
    unlink(argv[1]);
    return 0;
}
Link: https://bugzilla.kernel.org/show_bug.cgi?id=205833
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1474726
Signed-off-by: Aditya Prakash Srivastava <aditya.ansh182@gmail.com>
---
fs/xfs/xfs_file.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 845a97c9b063..401403e066c9 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1798,6 +1798,15 @@ xfs_file_release(
 	    xfs_is_zoned_inode(ip))
 		return 0;
 
+	/*
+	 * If the filesystem is frozen or freezing, don't trigger transactions
+	 * that would block close() indefinitely. Background block garbage
+	 * collection will clean up these speculative preallocations once
+	 * the filesystem thaws.
+	 */
+	if (!xfs_fs_writable(mp, SB_FREEZE_WRITE))
+		return 0;
+
 	/*
 	 * If we can't get the iolock just skip truncating the blocks past EOF
 	 * because we could deadlock with the mmap_lock otherwise. We'll get
--
2.47.3
* Re: [PATCH] xfs: prevent close() from hanging on frozen filesystems
From: Christoph Hellwig @ 2026-06-10 13:28 UTC
To: Aditya Srivastava; +Cc: Carlos Maiolino, linux-xfs, linux-kernel
On Wed, Jun 10, 2026 at 01:13:41PM +0000, Aditya Srivastava wrote:
> A simple C reproducer demonstrating the hang (compile with -pthread):
Can you contribute this under the GPL or a compatible license, and
maybe even wire it up to xfstests?
> + /*
> + * If the filesystem is frozen or freezing, don't trigger transactions
> + * that would block close() indefinitely. Background block garbage
> + * collection will clean up these speculative preallocations once
> + * the filesystem thaws.
> + */
> + if (!xfs_fs_writable(mp, SB_FREEZE_WRITE))
> + return 0;
Note that this is still racy as the freeze could come in right after
this check. Basically what we'd need to fix this properly is a flag
to xfs_trans_alloc that uses sb_start_intwrite_trylock when set, and
returns a suitable error in that case, which we'd then use to
unwind safely from release.