Linux EXT4 FS development
 help / color / mirror / Atom feed
From: Yun Zhou <yun.zhou@windriver.com>
To: <tytso@mit.edu>, <adilger.kernel@dilger.ca>,
	<libaokun@linux.alibaba.com>, <jack@suse.cz>,
	<ojaswin@linux.ibm.com>, <ritesh.list@gmail.com>,
	<yi.zhang@huawei.com>, <viro@zeniv.linux.org.uk>,
	<brauner@kernel.org>
Cc: <linux-fsdevel@vger.kernel.org>, <linux-ext4@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <yun.zhou@windriver.com>
Subject: [PATCH v9 0/4] ext4: deferred iput framework for EA inodes
Date: Tue, 23 Jun 2026 16:52:39 +0800	[thread overview]
Message-ID: <20260623085243.2816425-1-yun.zhou@windriver.com> (raw)
In-Reply-To: <20260623083540.2744885-5-yun.zhou@windriver.com>

This series introduces a deferred-iput framework for EA inodes to
eliminate a class of lock ordering issues in ext4 xattr code.

The problem: iput() on EA inodes while holding xattr_sem or a jbd2
handle can trigger eviction, which may acquire those same locks or
s_writepages_rwsem, creating circular dependencies.  The immediate
deadlock (during mount-time orphan cleanup) is fixed by two separate
patches already reviewed and posted:

  ext4: skip extra isize expansion during mount to prevent deadlock
  ext4: set EXT4_STATE_NO_EXPAND in ext4_evict_inode

This series provides the structural fix that makes the code safe
regardless of calling context:

Patch 1 adds a VFS helper iput_if_not_last() which drops an inode
reference only if it is not the last one, using atomic_add_unless().
This provides a proper VFS abstraction for filesystems that need to
conditionally defer final iput.

Patch 2 introduces ext4_put_ea_inode() using iput_if_not_last() as
a fast path (single atomic, zero overhead for the common case).  If
this is the last reference, the inode is linked onto a per-sb llist
(via i_ea_iput_node embedded in ext4_inode_info, union with xattr_sem
which is unused for EA inodes) and a delayed worker (1 jiffie) performs
the final iput() in a clean context.  No per-iput allocation needed.
Also moves init_rwsem(xattr_sem) from init_once to ext4_alloc_inode
to handle slab reuse after the union field has been overwritten.

Patch 3 converts all EA inode iput() calls in xattr code to use
ext4_put_ea_inode() uniformly -- no exceptions to reason about.

Patch 4 removes the now-redundant ea_inode_array mechanism (parameter
threading, struct, expand/free functions), replaced entirely by direct
ext4_put_ea_inode() calls.  This is a net code reduction.

Link: https://syzkaller.appspot.com/bug?extid=5d19358d7eb30ffb0cc5

v9:
 - Add iput_if_not_last() as proper VFS helper (per reviewer: don't
   let filesystems manipulate inode refcount without VFS abstraction).
 - Use iput_if_not_last() + llist_node embedded in ext4_inode_info
   (union with xattr_sem) to avoid per-iput allocation entirely.
 - Convert ALL EA inode iput() calls uniformly -- no exceptions.
 - Remove entire ea_inode_array mechanism.
 - Add WARN_ON_ONCE in ext4_put_ea_inode() to catch misuse on non-EA
   inodes (protects the xattr_sem union safety).
 - Fix worker re-arm: ext4_drain_ea_inode_work() loops to handle
   nested EA inode evictions re-scheduling work.
 - Move INIT_DELAYED_WORK before journal loading (fast commit replay
   may trigger evictions).
 - Drain before ext4_quotas_off() for correct quota accounting.
 - Add flush in failed_mount_wq and failed_mount3a error paths for
   journal replay case.
 - Move init_rwsem(xattr_sem) from init_once to ext4_alloc_inode to
   handle slab object reuse after union overwrite.
 - Encapsulate worker init into ext4_init_ea_inode_work(), making
   ext4_ea_inode_work() static to xattr.c.

(Resending with VFS maintainers Cc'd -- no code changes from initial posting)

Yun Zhou (4):
  fs: add iput_if_not_last() helper
  ext4: introduce ext4_put_ea_inode() for safe deferred iput
  ext4: convert all EA inode iput() calls to ext4_put_ea_inode()
  ext4: remove ea_inode_array mechanism in favor of ext4_put_ea_inode()

 fs/ext4/ext4.h     |  13 ++++-
 fs/ext4/inode.c    |   6 +-
 fs/ext4/super.c    |  18 +++++-
 fs/ext4/xattr.c    | 150 ++++++++++++++++++++-------------------------
 fs/ext4/xattr.h    |  21 ++++---
 include/linux/fs.h |  13 ++++
 6 files changed, 125 insertions(+), 96 deletions(-)

-- 
2.43.0


  reply	other threads:[~2026-06-23  8:53 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-23  8:35 [PATCH v9 0/4] ext4: deferred iput framework for EA inodes Yun Zhou
2026-06-23  8:35 ` [PATCH v9 1/4] fs: add iput_if_not_last() helper Yun Zhou
2026-06-23  8:35 ` [PATCH v9 2/4] ext4: introduce ext4_put_ea_inode() for safe deferred iput Yun Zhou
2026-06-23  8:35 ` [PATCH v9 3/4] ext4: convert all EA inode iput() calls to ext4_put_ea_inode() Yun Zhou
2026-06-23  8:35 ` [PATCH v9 4/4] ext4: remove ea_inode_array mechanism in favor of ext4_put_ea_inode() Yun Zhou
2026-06-23  8:52   ` Yun Zhou [this message]
2026-06-23  8:52     ` [PATCH v9 1/4] fs: add iput_if_not_last() helper Yun Zhou
2026-06-23  8:52     ` [PATCH v9 2/4] ext4: introduce ext4_put_ea_inode() for safe deferred iput Yun Zhou
2026-06-23  8:52     ` [PATCH v9 3/4] ext4: convert all EA inode iput() calls to ext4_put_ea_inode() Yun Zhou
2026-06-23  8:52     ` [PATCH v9 4/4] ext4: remove ea_inode_array mechanism in favor of ext4_put_ea_inode() Yun Zhou
2026-06-23 13:13 ` [syzbot ci] Re: ext4: deferred iput framework for EA inodes syzbot ci

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260623085243.2816425-1-yun.zhou@windriver.com \
    --to=yun.zhou@windriver.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=brauner@kernel.org \
    --cc=jack@suse.cz \
    --cc=libaokun@linux.alibaba.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ojaswin@linux.ibm.com \
    --cc=ritesh.list@gmail.com \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    --cc=yi.zhang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox