From: "Zhou, Yun" <yun.zhou@windriver.com>
To: Jan Kara <jack@suse.cz>
Cc: tytso@mit.edu, adilger.kernel@dilger.ca,
libaokun@linux.alibaba.com, ojaswin@linux.ibm.com,
ritesh.list@gmail.com, yi.zhang@huawei.com,
linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v7 3/4] ext4: introduce ext4_put_ea_inode() for safe deferred iput
Date: Fri, 19 Jun 2026 14:24:51 +0800
Message-ID: <dd9e35e6-306b-4e49-9802-487ce7abd63c@windriver.com>
In-Reply-To: <jxcbsd2ot63wy3dcoximemkuitwoqn2a7jgxcsfdwaf5q3ecdu@sahahqqopo6y>
On 6/18/2026 2:42 AM, Jan Kara wrote:
> On Tue 16-06-26 23:15:57, Yun Zhou wrote:
>> +
>> + /* Deferred iput for EA inodes to avoid lock ordering issues */
>> + struct llist_head s_ea_inode_to_free;
>> + struct work_struct s_ea_inode_work;
>> +
>
> I'd probably use delayed work and schedule it with a delay of one jiffie so
> that some inodes can accumulate before we process them which should reduce
> the amount of task switching to workqueues.
>
Good idea. I will use delayed_work in the next version.
>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
>> index 6a77db4d3124..b777bb0a81ea 100644
>> --- a/fs/ext4/super.c
>> +++ b/fs/ext4/super.c
>> @@ -1308,6 +1308,9 @@ static void ext4_put_super(struct super_block *sb)
>> destroy_workqueue(sbi->rsv_conversion_wq);
>> ext4_release_orphan_info(sb);
>>
>> + /* Flush deferred EA inode iputs before destroying journal */
>> + flush_work(&sbi->s_ea_inode_work);
>> +
>
> This should happen earlier in ext4_put_super(). At this place quotas were
> already turned off and so quota accounting would go wrong.
That makes sense. I'll move it up to right before ext4_quotas_off().
>> +static void ext4_xattr_inode_array_free_deferred(struct super_block *sb,
>> + struct ext4_xattr_inode_array *array)
>
> The array of EA inodes used in xattr handling is just another mechanism
> used for delaying iput() of EA inodes. It doesn't make sense to stack these
> to one on top of another. Just completely replace the array mechanism with
> always deferring iput of EA inode into the workqueue.
>
I'm thinking that a complete replacement might be too large a change.
Should we consider postponing this work, or perhaps appending a new
patch to this series to handle it?
>
> Allocating ext4_ea_iput_entry for dropping each inode is somewhat wasteful.
> I want to suggest another scheme (somewhat more involved but more efficient
> scheme):
>
> 1) Create a VFS helper bool iput_if_not_last(struct inode *inode) which
> drops inode reference if it is not the last one (and returns true in that
> case). Basically:
>
> bool iput_if_not_last(struct inode *inode)
> {
> return atomic_add_unless(&inode->i_count, -1, 1);
> }
>
> This needs to be a separate patch as it should get vetting from VFS
> maintainers.
>
> 2) Use iput_if_not_last() in ext4_put_ea_inode(). If it returns true, we
> are done. Otherwise we know we were at least for a moment holders of the
> last inode reference, so we link the inode to the list of inodes to drop
> through llist_node embedded in ext4_inode_info. We cannot race with anybody
> else trying to link the same inode into the list because we hold one inode
> ref and so nobody else can hit this "I was holding the last ref" path.
> I'd union this llist_node say with xattr_sem which is unused for EA inodes
> to avoid growing ext4_inode_info.
>
> This way we avoid offloading unless really necessary and we don't have to
> do allocations just to drop EA inode ref.
>
Your idea makes a lot of sense. It greatly simplifies the current deferred
iput logic and eliminates the risk of failing to allocate an entry during
an OOM. However, as you mentioned, getting the VFS maintainers to agree
might be quite challenging.
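
To make sure I understand the scheme, here is a minimal userspace model of
it. This is only a sketch under stated assumptions, not the proposed patch:
C11 atomics stand in for atomic_add_unless() and the kernel llist, and the
names model_inode, defer_next, deferred_head, put_ea_inode() and
drain_deferred() are made up for illustration. In the kernel the queueing
would presumably go through llist_add()/llist_del_all() and the final drop
would be a real iput() from the work function.

```c
/* Userspace model only: C11 atomics stand in for atomic_t and llist. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct model_inode {
	atomic_int i_count;             /* models inode->i_count */
	struct model_inode *defer_next; /* models an llist_node embedded in ext4_inode_info */
};

/* Models sbi->s_ea_inode_to_free: lock-free LIFO of inodes awaiting the final put. */
static _Atomic(struct model_inode *) deferred_head;

/*
 * Drop one reference unless it is the last one. Same contract as
 * atomic_add_unless(&inode->i_count, -1, 1): returns true if dropped.
 */
static bool iput_if_not_last(struct model_inode *inode)
{
	int old = atomic_load(&inode->i_count);

	while (old != 1) {
		/* On failure 'old' is refreshed with the current value; retry. */
		if (atomic_compare_exchange_weak(&inode->i_count, &old, old - 1))
			return true;
	}
	return false;
}

/* Returns true if the ref was dropped inline, false if the inode was queued. */
static bool put_ea_inode(struct model_inode *inode)
{
	struct model_inode *head;

	assert(atomic_load(&inode->i_count) >= 1);
	if (iput_if_not_last(inode))
		return true;

	/* We hold the last ref, so nobody else can be queueing this inode. */
	head = atomic_load(&deferred_head);
	do {
		inode->defer_next = head;
	} while (!atomic_compare_exchange_weak(&deferred_head, &head, inode));
	return false;
}

/* Models the work function: detach the whole list, then do the final puts. */
static int drain_deferred(void)
{
	struct model_inode *node =
		atomic_exchange(&deferred_head, (struct model_inode *)NULL);
	int n = 0;

	while (node) {
		struct model_inode *next = node->defer_next;

		atomic_fetch_sub(&node->i_count, 1); /* stands in for iput() */
		node = next;
		n++;
	}
	return n;
}
```

In this model an inode with two references is dropped inline, while an inode
whose only reference we hold is queued with its count left at 1 and is
released only when drain_deferred() runs. No allocation happens on either
path.
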
BR,
Yun