* [PATCH v4] ext4: fix ABBA deadlock in ext4_xattr_inode_cache_find()
From: Aditya Srivastava @ 2026-06-25 6:03 UTC (permalink / raw)
To: tytso, jack
Cc: adilger.kernel, libaokun, ritesh.list, yi.zhang, linux-ext4,
linux-kernel, Aditya Prakash Srivastava, Colin Ian King
From: Aditya Prakash Srivastava <aditya.ansh182@gmail.com>
Syzbot/stress-ng reported an ABBA deadlock in ext4 when exercising
concurrent xattr workloads (using the ea_inode mount/format option).
The deadlock occurs between the running transaction and the eviction
thread:
- Task 1 (stress-ng): Holds a reference to a shared mbcache_entry (ce)
and calls ext4_xattr_inode_cache_find() -> ext4_iget() to retrieve
the corresponding EA inode. Since the EA inode is currently being
evicted, ext4_iget() blocks in __wait_on_freeing_inode() waiting for
eviction to complete.
- Task 2 (eviction thread): Currently evicting the same EA inode in
ext4_evict_ea_inode(). It calls mb_cache_entry_wait_unused(oe) which
blocks waiting for Task 1 to release the reference to the mbcache_entry.
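The cycle described above can be sketched as a tiny wait-for graph in
userspace C. This is only an illustrative model, not kernel code: the task
indices, the waits_for[] array and the helper names are hypothetical, and
each task is assumed to wait on at most one other task.

```c
#include <stdbool.h>

/* Hypothetical task indices for the two threads in the report. */
enum { TASK_NONE = -1, TASK_STRESS_NG = 0, TASK_EVICT = 1, NR_TASKS = 2 };

/*
 * waits_for[t] names the task that must make progress before task t can
 * continue, or TASK_NONE if t is not blocked. Follow the chain from
 * 'start' and report whether it leads back to 'start'.
 */
static bool has_wait_cycle(const int *waits_for, int nr_tasks, int start)
{
	int cur = start;
	int steps;

	for (steps = 0; steps < nr_tasks; steps++) {
		cur = waits_for[cur];
		if (cur == TASK_NONE)
			return false;
		if (cur == start)
			return true;
	}
	return false;
}

/* Blocking lookup: each task waits for the other one. */
static bool blocking_lookup_deadlocks(void)
{
	int waits_for[NR_TASKS] = {
		[TASK_STRESS_NG] = TASK_EVICT,     /* waits for eviction to finish */
		[TASK_EVICT]     = TASK_STRESS_NG, /* waits for the entry reference */
	};

	return has_wait_cycle(waits_for, NR_TASKS, TASK_STRESS_NG);
}

/* Non-blocking lookup: Task 1 skips the inode, so it waits on nothing. */
static bool nowait_lookup_deadlocks(void)
{
	int waits_for[NR_TASKS] = {
		[TASK_STRESS_NG] = TASK_NONE,
		[TASK_EVICT]     = TASK_STRESS_NG,
	};

	return has_wait_cycle(waits_for, NR_TASKS, TASK_STRESS_NG);
}
```

In this model, removing the edge from Task 1 to the eviction thread is
enough to break the cycle; the eviction thread still waits, but only until
Task 1 drops its mbcache entry reference.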
To break this deadlock, implement a new ext4_iget() configuration flag
named EXT4_IGET_NOWAIT. When set, perform a non-blocking lookup of the
inode via VFS's find_inode_nowait() API.
If the inode is currently being evicted (marked with I_FREEING or
I_WILL_FREE) or created (I_CREATING), simply skip it (returning -ESTALE)
rather than waiting for eviction/creation to complete, breaking the ABBA
cycle. If the returned inode is I_NEW, wait for its initialization to
clear via wait_on_new_inode().
If initialization fails in another thread (e.g., due to an I/O read error),
the inode is already unhashed by the time wait_on_new_inode() returns. In
that case, drop the reference and return -ESTALE so that the caller skips
the xattr cache entry. Otherwise, the usual validation in check_igot_inode()
(is_bad_inode, EXT4_EA_INODE_FL, file_acl, and xattr flags) runs as before,
so an inode found through the non-blocking path gets the same checks as one
returned by iget_locked().
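The decision flow described above can be modelled in userspace C roughly as
follows. This is a sketch under stated assumptions, not the kernel
implementation: struct model_inode, the M_* state bits, example_tbl and all
model_* functions are hypothetical stand-ins. The match helper follows the
0 / 1 / -1 return convention of a find_inode_nowait() match callback, and
-ENOENT stands in for the point where the real code falls back to
iget_locked().

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's inode state bits. */
#define M_NEW       0x1u
#define M_FREEING   0x2u
#define M_WILL_FREE 0x4u
#define M_CREATING  0x8u

struct model_inode {
	unsigned long ino;
	unsigned int state;
	int refcount;
	bool hashed;
};

/* 0: keep searching, 1: take this inode, -1: stop and return nothing. */
static int model_match(struct model_inode *inode, unsigned long ino,
		       bool *is_freeing)
{
	if (inode->ino != ino)
		return 0;
	if (inode->state & (M_FREEING | M_WILL_FREE | M_CREATING)) {
		*is_freeing = true;
		return -1;
	}
	inode->refcount++;	/* models taking a reference */
	return 1;
}

static struct model_inode *model_find_nowait(struct model_inode *tbl,
					     size_t n, unsigned long ino,
					     bool *is_freeing)
{
	for (size_t i = 0; i < n; i++) {
		int ret = model_match(&tbl[i], ino, is_freeing);

		if (ret > 0)
			return &tbl[i];
		if (ret < 0)
			return NULL;
	}
	return NULL;
}

/*
 * Returns 0 with a reference held, -ESTALE if the entry should be skipped,
 * or -ENOENT if the inode is not cached (model-only return value).
 */
static int model_iget_nowait(struct model_inode *tbl, size_t n,
			     unsigned long ino)
{
	bool is_freeing = false;
	struct model_inode *inode;

	inode = model_find_nowait(tbl, n, ino, &is_freeing);
	if (is_freeing)
		return -ESTALE;
	if (!inode)
		return -ENOENT;
	if (inode->state & M_NEW) {
		/* Models the wait for initialization finishing. */
		inode->state &= ~M_NEW;
		if (!inode->hashed) {
			inode->refcount--;	/* models dropping the reference */
			return -ESTALE;
		}
	}
	return 0;
}

/* Example table: a live inode, one being freed, one whose init failed. */
static struct model_inode example_tbl[] = {
	{ .ino = 10, .state = 0,         .refcount = 1, .hashed = true  },
	{ .ino = 11, .state = M_FREEING, .refcount = 0, .hashed = true  },
	{ .ino = 12, .state = M_NEW,     .refcount = 1, .hashed = false },
};
```

With this table, looking up inode 10 succeeds and takes a reference, while
inodes 11 and 12 are both skipped with -ESTALE and keep their original
reference counts.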
In ext4_xattr_inode_cache_find(), invoke ext4_iget() with the new
EXT4_IGET_NOWAIT flag to perform the non-blocking cache search.
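From the caller's point of view, a skipped inode is just another candidate
that did not work out. A minimal userspace sketch of that loop, assuming a
hypothetical struct model_entry that records what the lookup would return
for each cache entry (none of these names exist in ext4):

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical cache entry: EA inode number plus the lookup outcome. */
struct model_entry {
	unsigned long ea_ino;
	int lookup_err;		/* 0 on success, negative errno otherwise */
};

/*
 * Walk the candidates in order and return the inode number of the first
 * one whose lookup succeeds, or 0 if every candidate had to be skipped.
 */
static unsigned long model_cache_find(const struct model_entry *ce, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (ce[i].lookup_err)
			continue;	/* mirrors "goto next_entry" */
		return ce[i].ea_ino;
	}
	return 0;
}

/* Example chain: the first candidate is being evicted, the second is fine. */
static const struct model_entry example_chain[] = {
	{ .ea_ino = 41, .lookup_err = -ESTALE },
	{ .ea_ino = 42, .lookup_err = 0 },
	{ .ea_ino = 43, .lookup_err = 0 },
};
```

Since the existing loop already jumps to next_entry on any IS_ERR() result,
the -ESTALE case needs no new handling in xattr.c in this model.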
Suggested-by: Jan Kara <jack@suse.cz>
Reported-by: Colin Ian King <colin.i.king@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219283
Fixes: 0a46ef234756 ("ext4: do not create EA inode under buffer lock")
Signed-off-by: Aditya Prakash Srivastava <aditya.ansh182@gmail.com>
---
Changes in v4:
- Check whether the inode is unhashed once wait_on_new_inode() returns,
to handle transient initialization failures (like I/O read errors)
gracefully. Dropping the reference and returning -ESTALE avoids
spurious filesystem corruption errors (__ext4_error), as found by the
Sashiko AI bot.
Changes in v3:
- Implement a new ext4_iget() configuration flag named EXT4_IGET_NOWAIT to
fully contain the non-blocking lookup and VFS-level validations within
inode.c, as requested by Jan Kara.
- Skip inodes currently being created (I_CREATING), following Jan Kara's
direct feedback.
- Remove all open-coded match helpers and VFS state-checks from xattr.c.
Changes in v2:
- Read inode state locklessly using inode_state_read_once() to resolve
a lockdep assertion on cache hit.
- Manually restore essential inode/ea_inode validations on the retrieved
inode (is_bad_inode, EXT4_EA_INODE_FL, file_acl, and xattr checks) to
match VFS safety guarantees and prevent using corrupted/failed inodes.
fs/ext4/ext4.h | 3 ++-
fs/ext4/inode.c | 46 +++++++++++++++++++++++++++++++++++++++++++---
fs/ext4/xattr.c | 2 +-
3 files changed, 46 insertions(+), 5 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index b37c136ea3ab..c76dd0bdd3d8 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3144,7 +3144,8 @@ typedef enum {
 	EXT4_IGET_SPECIAL = 0x0001, /* OK to iget a system inode */
 	EXT4_IGET_HANDLE = 0x0002, /* Inode # is from a handle */
 	EXT4_IGET_BAD = 0x0004, /* Allow to iget a bad inode */
-	EXT4_IGET_EA_INODE = 0x0008 /* Inode should contain an EA value */
+	EXT4_IGET_EA_INODE = 0x0008, /* Inode should contain an EA value */
+	EXT4_IGET_NOWAIT = 0x0010 /* Non-blocking lookup (skip if freeing) */
 } ext4_iget_flags;
extern struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index ce99807c5f5b..75ed467f5abf 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5270,6 +5270,24 @@ void ext4_set_inode_mapping_order(struct inode *inode)
 	mapping_set_folio_order_range(inode->i_mapping, min_order, max_order);
 }
+static int ext4_iget_match(struct inode *inode, u64 ino, void *data)
+{
+	bool *is_freeing = data;
+
+	if (inode->i_ino != ino)
+		return 0;
+	spin_lock(&inode->i_lock);
+	if (inode_state_read(inode) & (I_FREEING | I_WILL_FREE | I_CREATING)) {
+		if (is_freeing)
+			*is_freeing = true;
+		spin_unlock(&inode->i_lock);
+		return -1;
+	}
+	__iget(inode);
+	spin_unlock(&inode->i_lock);
+	return 1;
+}
+
 struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
 			  ext4_iget_flags flags, const char *function,
 			  unsigned int line)
@@ -5298,9 +5316,31 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
 		return ERR_PTR(-EFSCORRUPTED);
 	}
-	inode = iget_locked(sb, ino);
-	if (!inode)
-		return ERR_PTR(-ENOMEM);
+	if (flags & EXT4_IGET_NOWAIT) {
+		bool is_freeing = false;
+
+		inode = find_inode_nowait(sb, ino, ext4_iget_match, &is_freeing);
+		if (is_freeing)
+			return ERR_PTR(-ESTALE);
+		if (!inode) {
+			inode = iget_locked(sb, ino);
+			if (!inode)
+				return ERR_PTR(-ENOMEM);
+		} else {
+			if (inode_state_read_once(inode) & I_NEW) {
+				wait_on_new_inode(inode);
+				if (unlikely(inode_unhashed(inode))) {
+					iput(inode);
+					return ERR_PTR(-ESTALE);
+				}
+			}
+		}
+	} else {
+		inode = iget_locked(sb, ino);
+		if (!inode)
+			return ERR_PTR(-ENOMEM);
+	}
+
 	if (!(inode_state_read_once(inode) & I_NEW)) {
 		ret = check_igot_inode(inode, flags, function, line);
 		if (ret) {
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 982a1f831e22..21b5670d8503 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1550,7 +1550,7 @@ ext4_xattr_inode_cache_find(struct inode *inode, const void *value,
 	while (ce) {
 		ea_inode = ext4_iget(inode->i_sb, ce->e_value,
-				     EXT4_IGET_EA_INODE);
+				     EXT4_IGET_EA_INODE | EXT4_IGET_NOWAIT);
 		if (IS_ERR(ea_inode))
 			goto next_entry;
 		ext4_xattr_inode_set_class(ea_inode);
--
2.47.3