linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Christian Brauner <brauner@kernel.org>
To: Josef Bacik <josef@toxicpanda.com>
Cc: linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org,
	 kernel-team@fb.com, linux-ext4@vger.kernel.org,
	linux-xfs@vger.kernel.org,  viro@zeniv.linux.org.uk
Subject: Re: [PATCH 16/50] fs: change evict_inodes to use iput instead of evict directly
Date: Mon, 25 Aug 2025 11:07:55 +0200	[thread overview]
Message-ID: <20250825-entbinden-kehle-2e1f8b67b190@brauner> (raw)
In-Reply-To: <1198cd4cd35c5875fbf95dc3dca68650bb176bb1.1755806649.git.josef@toxicpanda.com>

On Thu, Aug 21, 2025 at 04:18:27PM -0400, Josef Bacik wrote:
> At evict_inodes() time, we no longer have SB_ACTIVE set, so we can
> easily go through the normal iput path to clear any inodes. Update

I'm a bit lost why SB_ACTIVE is used here as a justification to call
iput(). I think it's because iput_final() would somehow add it back to
the LRU if SB_ACTIVE was still set and the filesystem somehow would
indicate it wouldn't want to drop the inode.

I'm confused where that would even happen. IOW, which filesystem would
indicate "don't drop the inode" even though it's about to vanish. But
anyway, that's probably not important because...

> dispose_list() to check how we need to free the inode, and then grab a
> full reference to the inode while we're looping through the remaining
> inodes, and simply iput them at the end.
> 
> Since we're just calling iput we don't really care about the i_count on
> the inode at the current time.  Remove the i_count checks and just call
> iput on every inode we find.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/inode.c | 26 +++++++++++---------------
>  1 file changed, 11 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/inode.c b/fs/inode.c
> index 72981b890ec6..80ad327746a7 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -933,7 +933,7 @@ static void evict(struct inode *inode)
>   * Dispose-list gets a local list with local inodes in it, so it doesn't
>   * need to worry about list corruption and SMP locks.
>   */
> -static void dispose_list(struct list_head *head)
> +static void dispose_list(struct list_head *head, bool for_lru)
>  {
>  	while (!list_empty(head)) {
>  		struct inode *inode;
> @@ -941,8 +941,12 @@ static void dispose_list(struct list_head *head)
>  		inode = list_first_entry(head, struct inode, i_lru);
>  		list_del_init(&inode->i_lru);
>  
> -		evict(inode);
> -		iobj_put(inode);
> +		if (for_lru) {
> +			evict(inode);
> +			iobj_put(inode);
> +		} else {
> +			iput(inode);
> +		}

... Afaict, if we end up in dispose_list() we came from one of two
locations:

(1) prune_icache_sb()
    In which case inode_lru_isolate() will have only returned inodes
    that prior to your changes would have inode->i_count zero.

(2) evict_inodes()
    Similar story, this only hits inodes with inode->i_count zero.

With your change you're adding an increment from zero for (2) via
__iget() so that you always end up with a full refcount, and that is
backing your changes to dispose_list() later.

I don't see the same done for (1) though and so your later call to
iput() drops the reference below zero? It's accidently benign because
iiuc atomic_dec_and_test() will simply tell you that reference count
didn't go to zero and so iput() will back off. But still this should be
fixed if I'm right.

The conversion to iput() is introducing a lot of subtlety in the middle
of the series. If I'm right then the iput() is a always a nop because in
all cases it was an increment from zero. But it isn't really a nop
because we still do stuff like call ->drop_inode() again. Maybe it's
fine because no filesystem would have issues with this but I wouldn't
count on it and also it feels rather unclean to do it this way.

So, under the assumption, that after the increment from zero you did, we
really only have a blatant zombie inode on our hands and we only need to
get rid of the i_count we took make that explicit and do:

	if (for_lru) {
		evict(inode);
		iobj_put(inode);
	} else {
		/* This inode was always incremented from zero.
		 * Get rid of that reference without doing anything else.
		 */
		WARN_ON_ONCE(!atomic_dec_and_test(&inode->i_count));
	}

Btw, for the iobj_put() above, I assume that we're not guaranteed that
i_obj_count == 1?

  reply	other threads:[~2025-08-25  9:08 UTC|newest]

Thread overview: 85+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-21 20:18 [PATCH 00/50] fs: rework inode reference counting Josef Bacik
2025-08-21 20:18 ` [PATCH 01/50] fs: add an i_obj_count refcount to the inode Josef Bacik
2025-08-21 20:18 ` [PATCH 02/50] fs: make the i_state flags an enum Josef Bacik
2025-08-22 11:08   ` Christian Brauner
2025-08-22 13:31     ` Josef Bacik
2025-08-22 14:36     ` David Sterba
2025-08-22 11:18   ` Sun YangKai
2025-08-22 11:42     ` [PATCH 02/50] " Alan Huang
2025-08-22 12:11       ` Sun YangKai
2025-08-22 14:40     ` [PATCH 02/50] fs: " Josef Bacik
2025-08-21 20:18 ` [PATCH 03/50] fs: hold an i_obj_count reference in wait_sb_inodes Josef Bacik
2025-08-21 20:18 ` [PATCH 04/50] fs: hold an i_obj_count reference for the i_wb_list Josef Bacik
2025-08-22 11:27   ` Christian Brauner
2025-08-21 20:18 ` [PATCH 05/50] fs: hold an i_obj_count reference for the i_io_list Josef Bacik
2025-08-21 20:18 ` [PATCH 06/50] fs: hold an i_obj_count reference in writeback_sb_inodes Josef Bacik
2025-08-22 12:20   ` Christian Brauner
2025-08-21 20:18 ` [PATCH 07/50] fs: hold an i_obj_count reference while on the hashtable Josef Bacik
2025-08-21 20:18 ` [PATCH 08/50] fs: hold an i_obj_count reference while on the LRU list Josef Bacik
2025-08-21 20:18 ` [PATCH 09/50] fs: hold an i_obj_count reference while on the sb inode list Josef Bacik
2025-08-21 20:18 ` [PATCH 10/50] fs: stop accessing ->i_count directly in f2fs and gfs2 Josef Bacik
2025-08-22 12:38   ` (subset) " Christian Brauner
2025-08-21 20:18 ` [PATCH 11/50] fs: hold an i_obj_count when we have an i_count reference Josef Bacik
2025-08-21 20:18 ` [PATCH 12/50] fs: rework iput logic Josef Bacik
2025-08-22 12:54   ` Christian Brauner
2025-08-21 20:18 ` [PATCH 13/50] fs: add an I_LRU flag to the inode Josef Bacik
2025-08-21 20:18 ` [PATCH 14/50] fs: maintain a list of pinned inodes Josef Bacik
2025-08-22 14:55   ` Christian Brauner
2025-08-21 20:18 ` [PATCH 15/50] fs: delete the inode from the LRU list on lookup Josef Bacik
2025-08-22 15:27   ` Christian Brauner
2025-08-21 20:18 ` [PATCH 16/50] fs: change evict_inodes to use iput instead of evict directly Josef Bacik
2025-08-25  9:07   ` Christian Brauner [this message]
2025-08-25 19:35     ` Josef Bacik
2025-08-26  9:56       ` Christian Brauner
2025-08-21 20:18 ` [PATCH 17/50] fs: hold a full ref while the inode is on a LRU Josef Bacik
2025-08-25  9:20   ` Christian Brauner
2025-08-25 10:40     ` Christian Brauner
2025-08-21 20:18 ` [PATCH 18/50] fs: disallow 0 reference count inodes Josef Bacik
2025-08-25 10:54   ` Christian Brauner
2025-08-25 19:26     ` Josef Bacik
2025-08-26  9:28       ` Christian Brauner
2025-08-21 20:18 ` [PATCH 19/50] fs: make evict_inodes add to the dispose list under the i_lock Josef Bacik
2025-08-21 20:18 ` [PATCH 20/50] fs: convert i_count to refcount_t Josef Bacik
2025-08-22 12:10   ` Amir Goldstein
2025-08-22 13:56   ` kernel test robot
2025-08-25 11:03   ` Christian Brauner
2025-08-21 20:18 ` [PATCH 21/50] fs: use refcount_inc_not_zero in igrab Josef Bacik
2025-08-25 11:21   ` Christian Brauner
2025-08-21 20:18 ` [PATCH 22/50] fs: use inode_tryget in find_inode* Josef Bacik
2025-08-25 11:26   ` Christian Brauner
2025-08-21 20:18 ` [PATCH 23/50] fs: update find_inode_*rcu to check the i_count count Josef Bacik
2025-08-25 11:27   ` Christian Brauner
2025-08-21 20:18 ` [PATCH 24/50] fs: use igrab in insert_inode_locked Josef Bacik
2025-08-21 20:18 ` [PATCH 25/50] fs: remove I_WILL_FREE|I_FREEING check from __inode_add_lru Josef Bacik
2025-08-21 20:18 ` [PATCH 26/50] fs: remove I_WILL_FREE|I_FREEING check in inode_pin_lru_isolating Josef Bacik
2025-08-21 20:18 ` [PATCH 27/50] fs: use inode_tryget in evict_inodes Josef Bacik
2025-08-25 11:43   ` Christian Brauner
2025-08-25 18:22     ` Josef Bacik
2025-08-21 20:18 ` [PATCH 28/50] fs: change evict_dentries_for_decrypted_inodes to use refcount Josef Bacik
2025-08-21 20:18 ` [PATCH 29/50] block: use igrab in sync_bdevs Josef Bacik
2025-08-21 20:18 ` [PATCH 30/50] bcachefs: use the refcount instead of I_WILL_FREE|I_FREEING Josef Bacik
2025-08-21 20:18 ` [PATCH 31/50] btrfs: don't check I_WILL_FREE|I_FREEING Josef Bacik
2025-08-21 20:18 ` [PATCH 32/50] fs: use igrab in drop_pagecache_sb Josef Bacik
2025-08-21 20:18 ` [PATCH 33/50] fs: stop checking I_FREEING in d_find_alias_rcu Josef Bacik
2025-08-21 20:18 ` [PATCH 34/50] ext4: stop checking I_WILL_FREE|IFREEING in ext4_check_map_extents_env Josef Bacik
2025-08-21 20:18 ` [PATCH 35/50] fs: remove I_WILL_FREE|I_FREEING from fs-writeback.c Josef Bacik
2025-08-25 11:46   ` Christian Brauner
2025-08-21 20:18 ` [PATCH 36/50] gfs2: remove I_WILL_FREE|I_FREEING usage Josef Bacik
2025-08-21 20:18 ` [PATCH 37/50] fs: remove I_WILL_FREE|I_FREEING check from dquot.c Josef Bacik
2025-08-21 20:18 ` [PATCH 38/50] notify: remove I_WILL_FREE|I_FREEING checks in fsnotify_unmount_inodes Josef Bacik
2025-08-21 20:18 ` [PATCH 39/50] xfs: remove I_FREEING check Josef Bacik
2025-08-21 20:18 ` [PATCH 40/50] landlock: remove I_FREEING|I_WILL_FREE check Josef Bacik
2025-08-21 20:18 ` [PATCH 41/50] fs: change inode_is_dirtytime_only to use refcount Josef Bacik
2025-08-21 20:18 ` [PATCH 42/50] btrfs: remove references to I_FREEING Josef Bacik
2025-08-21 20:18 ` [PATCH 43/50] ext4: remove reference to I_FREEING in inode.c Josef Bacik
2025-08-21 20:18 ` [PATCH 44/50] ext4: remove reference to I_FREEING in orphan.c Josef Bacik
2025-08-21 20:18 ` [PATCH 45/50] pnfs: use i_count refcount to determine if the inode is going away Josef Bacik
2025-08-21 20:18 ` [PATCH 46/50] fs: remove some spurious I_FREEING references in inode.c Josef Bacik
2025-08-21 20:18 ` [PATCH 47/50] xfs: remove reference to I_FREEING|I_WILL_FREE Josef Bacik
2025-08-21 20:18 ` [PATCH 48/50] ocfs2: do not set I_WILL_FREE Josef Bacik
2025-08-21 20:19 ` [PATCH 49/50] fs: remove I_FREEING|I_WILL_FREE Josef Bacik
2025-08-25 11:53   ` Christian Brauner
2025-08-21 20:19 ` [PATCH 50/50] fs: add documentation explaining the reference count rules for inodes Josef Bacik
2025-08-25 11:56   ` Christian Brauner
2025-08-22 10:51 ` [PATCH 00/50] fs: rework inode reference counting Christian Brauner
2025-08-22 13:30   ` Josef Bacik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250825-entbinden-kehle-2e1f8b67b190@brauner \
    --to=brauner@kernel.org \
    --cc=josef@toxicpanda.com \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).