From: Josef Bacik <josef@toxicpanda.com>
To: Christian Brauner <brauner@kernel.org>
Cc: linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org,
kernel-team@fb.com, linux-ext4@vger.kernel.org,
linux-xfs@vger.kernel.org, viro@zeniv.linux.org.uk
Subject: Re: [PATCH 02/50] fs: make the i_state flags an enum
Date: Fri, 22 Aug 2025 09:31:51 -0400 [thread overview]
Message-ID: <20250822133151.GB927384@perftesting> (raw)
In-Reply-To: <20250822-orcas-bemannten-728c9946b160@brauner>
On Fri, Aug 22, 2025 at 01:08:07PM +0200, Christian Brauner wrote:
> On Thu, Aug 21, 2025 at 04:18:13PM -0400, Josef Bacik wrote:
> > Adjusting i_state flags always means updating the values manually. Bring
> > these forward into the 2020's and make a nice clean macro for defining
> > the i_state values as an enum, providing __ variants for the cases where
> > we need the bit position instead of the actual value, and leaving the
> > actual NAME as the 1U << bit value.
> >
> > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> > ---
> > include/linux/fs.h | 234 +++++++++++++++++++++++----------------------
> > 1 file changed, 122 insertions(+), 112 deletions(-)
> >
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 9a1ce67eed33..e741dc453c2c 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -665,6 +665,127 @@ is_uncached_acl(struct posix_acl *acl)
> > #define IOP_MGTIME 0x0020
> > #define IOP_CACHED_LINK 0x0040
> >
> > +/*
> > + * Inode state bits. Protected by inode->i_lock
> > + *
> > + * Four bits determine the dirty state of the inode: I_DIRTY_SYNC,
> > + * I_DIRTY_DATASYNC, I_DIRTY_PAGES, and I_DIRTY_TIME.
> > + *
> > + * Four bits define the lifetime of an inode. Initially, inodes are I_NEW,
> > + * until that flag is cleared. I_WILL_FREE, I_FREEING and I_CLEAR are set at
> > + * various stages of removing an inode.
> > + *
> > + * Two bits are used for locking and completion notification, I_NEW and I_SYNC.
> > + *
> > + * I_DIRTY_SYNC Inode is dirty, but doesn't have to be written on
> > + * fdatasync() (unless I_DIRTY_DATASYNC is also set).
> > + * Timestamp updates are the usual cause.
> > + * I_DIRTY_DATASYNC Data-related inode changes pending. We keep track of
> > + * these changes separately from I_DIRTY_SYNC so that we
> > + * don't have to write inode on fdatasync() when only
> > + * e.g. the timestamps have changed.
> > + * I_DIRTY_PAGES Inode has dirty pages. Inode itself may be clean.
> > + * I_DIRTY_TIME The inode itself has dirty timestamps, and the
> > + * lazytime mount option is enabled. We keep track of this
> > + * separately from I_DIRTY_SYNC in order to implement
> > + * lazytime. This gets cleared if I_DIRTY_INODE
> > + * (I_DIRTY_SYNC and/or I_DIRTY_DATASYNC) gets set. But
> > + * I_DIRTY_TIME can still be set if I_DIRTY_SYNC is already
> > + * in place because writeback might already be in progress
> > + * and we don't want to lose the time update
> > + * I_NEW Serves as both a mutex and completion notification.
> > + * New inodes set I_NEW. If two processes both create
> > + * the same inode, one of them will release its inode and
> > + * wait for I_NEW to be released before returning.
> > + * Inodes in I_WILL_FREE, I_FREEING or I_CLEAR state can
> > + * also cause waiting on I_NEW, without I_NEW actually
> > + * being set. find_inode() uses this to prevent returning
> > + * nearly-dead inodes.
> > + * I_WILL_FREE Must be set when calling write_inode_now() if i_count
> > + * is zero. I_FREEING must be set when I_WILL_FREE is
> > + * cleared.
> > + * I_FREEING Set when inode is about to be freed but still has dirty
> > + * pages or buffers attached or the inode itself is still
> > + * dirty.
> > + * I_CLEAR Added by clear_inode(). In this state the inode is
> > + * clean and can be destroyed. Inode keeps I_FREEING.
> > + *
> > + * Inodes that are I_WILL_FREE, I_FREEING or I_CLEAR are
> > + * prohibited for many purposes. iget() must wait for
> > + * the inode to be completely released, then create it
> > + * anew. Other functions will just ignore such inodes,
> > + * if appropriate. I_NEW is used for waiting.
> > + *
> > + * I_SYNC Writeback of inode is running. The bit is set during
> > + * data writeback, and cleared with a wakeup on the bit
> > + * address once it is done. The bit is also used to pin
> > + * the inode in memory for flusher thread.
> > + *
> > + * I_REFERENCED Marks the inode as recently references on the LRU list.
> > + *
> > + * I_WB_SWITCH Cgroup bdi_writeback switching in progress. Used to
> > + * synchronize competing switching instances and to tell
> > + * wb stat updates to grab the i_pages lock. See
> > + * inode_switch_wbs_work_fn() for details.
> > + *
> > + * I_OVL_INUSE Used by overlayfs to get exclusive ownership on upper
> > + * and work dirs among overlayfs mounts.
> > + *
> > + * I_CREATING New object's inode in the middle of setting up.
> > + *
> > + * I_DONTCACHE Evict inode as soon as it is not used anymore.
> > + *
> > + * I_SYNC_QUEUED Inode is queued in b_io or b_more_io writeback lists.
> > + * Used to detect that mark_inode_dirty() should not move
> > + * inode between dirty lists.
> > + *
> > + * I_PINNING_FSCACHE_WB Inode is pinning an fscache object for writeback.
> > + *
> > + * I_LRU_ISOLATING Inode is pinned being isolated from LRU without holding
> > + * i_count.
> > + *
> > + * Q: What is the difference between I_WILL_FREE and I_FREEING?
> > + *
> > + * __I_{SYNC,NEW,LRU_ISOLATING} are used to derive unique addresses to wait
> > + * upon. There's one free address left.
> > + */
> > +
> > +/*
> > + * As simple macro to define the inode state bits, __NAME will be the bit value
> > + * (0, 1, 2, ...), and NAME will be the bit mask (1U << __NAME). The __NAME_SEQ
> > + * is used to reset the sequence number so the next name gets the next bit value
> > + * in the sequence.
> > + */
> > +#define INODE_BIT(name) \
> > + __ ## name, \
> > + name = (1U << __ ## name), \
> > + __ ## name ## _SEQ = __ ## name
>
> I'm not sure if this is the future we want :D
> I think it's harder to parse than what we have now.
>
> > +
> > +enum inode_state_bits {
> > + INODE_BIT(I_NEW),
> > + INODE_BIT(I_SYNC),
> > + INODE_BIT(I_LRU_ISOLATING),
> > + INODE_BIT(I_DIRTY_SYNC),
> > + INODE_BIT(I_DIRTY_DATASYNC),
> > + INODE_BIT(I_DIRTY_PAGES),
> > + INODE_BIT(I_WILL_FREE),
> > + INODE_BIT(I_FREEING),
> > + INODE_BIT(I_CLEAR),
> > + INODE_BIT(I_REFERENCED),
> > + INODE_BIT(I_LINKABLE),
> > + INODE_BIT(I_DIRTY_TIME),
> > + INODE_BIT(I_WB_SWITCH),
> > + INODE_BIT(I_OVL_INUSE),
> > + INODE_BIT(I_CREATING),
> > + INODE_BIT(I_DONTCACHE),
> > + INODE_BIT(I_SYNC_QUEUED),
> > + INODE_BIT(I_PINNING_NETFS_WB),
> > +};
>
> Good idea but I really dislike this macro indirection.
> Can't we just do the really boring?
>
> enum inode_state_bits {
> __I_BIT_NEW = 0U
> __I_BIT_SYNC = 1U
> __I_BIT_LRU_ISOLATING = 2U
> }
>
> enum inode_state_flags_t {
> I_NEW = (1U << __I_BIT_NEW)
> I_SYNC = (1U << __I_BIT_SYNC)
> I_LRU_ISOLATING = (1U << __I_BIT_LRU_ISOLATING)
> I_DIRTY_SYNC = (1U << 3)
> I_DIRTY_DATASYNC = (1U << 4)
> I_DIRTY_PAGES = (1U << 5)
> I_WILL_FREE = (1U << 6)
> I_FREEING = (1U << 7)
> I_CLEAR = (1U << 8)
> I_REFERENCED = (1U << 9)
> I_LINKABLE = (1U << 10)
> I_DIRTY_TIME = (1U << 11)
> I_WB_SWITCH = (1U << 12)
> I_OVL_INUSE = (1U << 13)
> I_CREATING = (1U << 14)
> I_DONTCACHE = (1U << 15)
> I_SYNC_QUEUED = (1U << 16)
> I_PINNING_NETFS_WB = (1U << 17)
> };
>
> Note that inode_state_wait_address() and that only works on four bits so
> we can't really use higher bits anyway without switching back to a
> scheme where we have to use unsigned long and waste for bytes for
> nothing on 64 bit.
>
> With that out of the way,
>
> Reviewed-by: Christian Brauner <brauner@kernel.org>
Yup totally, I'll fix this and add your RB. Thanks!
Josef
next prev parent reply other threads:[~2025-08-22 13:31 UTC|newest]
Thread overview: 85+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-21 20:18 [PATCH 00/50] fs: rework inode reference counting Josef Bacik
2025-08-21 20:18 ` [PATCH 01/50] fs: add an i_obj_count refcount to the inode Josef Bacik
2025-08-21 20:18 ` [PATCH 02/50] fs: make the i_state flags an enum Josef Bacik
2025-08-22 11:08 ` Christian Brauner
2025-08-22 13:31 ` Josef Bacik [this message]
2025-08-22 14:36 ` David Sterba
2025-08-22 11:18 ` Sun YangKai
2025-08-22 11:42 ` [PATCH 02/50] " Alan Huang
2025-08-22 12:11 ` Sun YangKai
2025-08-22 14:40 ` [PATCH 02/50] fs: " Josef Bacik
2025-08-21 20:18 ` [PATCH 03/50] fs: hold an i_obj_count reference in wait_sb_inodes Josef Bacik
2025-08-21 20:18 ` [PATCH 04/50] fs: hold an i_obj_count reference for the i_wb_list Josef Bacik
2025-08-22 11:27 ` Christian Brauner
2025-08-21 20:18 ` [PATCH 05/50] fs: hold an i_obj_count reference for the i_io_list Josef Bacik
2025-08-21 20:18 ` [PATCH 06/50] fs: hold an i_obj_count reference in writeback_sb_inodes Josef Bacik
2025-08-22 12:20 ` Christian Brauner
2025-08-21 20:18 ` [PATCH 07/50] fs: hold an i_obj_count reference while on the hashtable Josef Bacik
2025-08-21 20:18 ` [PATCH 08/50] fs: hold an i_obj_count reference while on the LRU list Josef Bacik
2025-08-21 20:18 ` [PATCH 09/50] fs: hold an i_obj_count reference while on the sb inode list Josef Bacik
2025-08-21 20:18 ` [PATCH 10/50] fs: stop accessing ->i_count directly in f2fs and gfs2 Josef Bacik
2025-08-22 12:38 ` (subset) " Christian Brauner
2025-08-21 20:18 ` [PATCH 11/50] fs: hold an i_obj_count when we have an i_count reference Josef Bacik
2025-08-21 20:18 ` [PATCH 12/50] fs: rework iput logic Josef Bacik
2025-08-22 12:54 ` Christian Brauner
2025-08-21 20:18 ` [PATCH 13/50] fs: add an I_LRU flag to the inode Josef Bacik
2025-08-21 20:18 ` [PATCH 14/50] fs: maintain a list of pinned inodes Josef Bacik
2025-08-22 14:55 ` Christian Brauner
2025-08-21 20:18 ` [PATCH 15/50] fs: delete the inode from the LRU list on lookup Josef Bacik
2025-08-22 15:27 ` Christian Brauner
2025-08-21 20:18 ` [PATCH 16/50] fs: change evict_inodes to use iput instead of evict directly Josef Bacik
2025-08-25 9:07 ` Christian Brauner
2025-08-25 19:35 ` Josef Bacik
2025-08-26 9:56 ` Christian Brauner
2025-08-21 20:18 ` [PATCH 17/50] fs: hold a full ref while the inode is on a LRU Josef Bacik
2025-08-25 9:20 ` Christian Brauner
2025-08-25 10:40 ` Christian Brauner
2025-08-21 20:18 ` [PATCH 18/50] fs: disallow 0 reference count inodes Josef Bacik
2025-08-25 10:54 ` Christian Brauner
2025-08-25 19:26 ` Josef Bacik
2025-08-26 9:28 ` Christian Brauner
2025-08-21 20:18 ` [PATCH 19/50] fs: make evict_inodes add to the dispose list under the i_lock Josef Bacik
2025-08-21 20:18 ` [PATCH 20/50] fs: convert i_count to refcount_t Josef Bacik
2025-08-22 12:10 ` Amir Goldstein
2025-08-22 13:56 ` kernel test robot
2025-08-25 11:03 ` Christian Brauner
2025-08-21 20:18 ` [PATCH 21/50] fs: use refcount_inc_not_zero in igrab Josef Bacik
2025-08-25 11:21 ` Christian Brauner
2025-08-21 20:18 ` [PATCH 22/50] fs: use inode_tryget in find_inode* Josef Bacik
2025-08-25 11:26 ` Christian Brauner
2025-08-21 20:18 ` [PATCH 23/50] fs: update find_inode_*rcu to check the i_count count Josef Bacik
2025-08-25 11:27 ` Christian Brauner
2025-08-21 20:18 ` [PATCH 24/50] fs: use igrab in insert_inode_locked Josef Bacik
2025-08-21 20:18 ` [PATCH 25/50] fs: remove I_WILL_FREE|I_FREEING check from __inode_add_lru Josef Bacik
2025-08-21 20:18 ` [PATCH 26/50] fs: remove I_WILL_FREE|I_FREEING check in inode_pin_lru_isolating Josef Bacik
2025-08-21 20:18 ` [PATCH 27/50] fs: use inode_tryget in evict_inodes Josef Bacik
2025-08-25 11:43 ` Christian Brauner
2025-08-25 18:22 ` Josef Bacik
2025-08-21 20:18 ` [PATCH 28/50] fs: change evict_dentries_for_decrypted_inodes to use refcount Josef Bacik
2025-08-21 20:18 ` [PATCH 29/50] block: use igrab in sync_bdevs Josef Bacik
2025-08-21 20:18 ` [PATCH 30/50] bcachefs: use the refcount instead of I_WILL_FREE|I_FREEING Josef Bacik
2025-08-21 20:18 ` [PATCH 31/50] btrfs: don't check I_WILL_FREE|I_FREEING Josef Bacik
2025-08-21 20:18 ` [PATCH 32/50] fs: use igrab in drop_pagecache_sb Josef Bacik
2025-08-21 20:18 ` [PATCH 33/50] fs: stop checking I_FREEING in d_find_alias_rcu Josef Bacik
2025-08-21 20:18 ` [PATCH 34/50] ext4: stop checking I_WILL_FREE|IFREEING in ext4_check_map_extents_env Josef Bacik
2025-08-21 20:18 ` [PATCH 35/50] fs: remove I_WILL_FREE|I_FREEING from fs-writeback.c Josef Bacik
2025-08-25 11:46 ` Christian Brauner
2025-08-21 20:18 ` [PATCH 36/50] gfs2: remove I_WILL_FREE|I_FREEING usage Josef Bacik
2025-08-21 20:18 ` [PATCH 37/50] fs: remove I_WILL_FREE|I_FREEING check from dquot.c Josef Bacik
2025-08-21 20:18 ` [PATCH 38/50] notify: remove I_WILL_FREE|I_FREEING checks in fsnotify_unmount_inodes Josef Bacik
2025-08-21 20:18 ` [PATCH 39/50] xfs: remove I_FREEING check Josef Bacik
2025-08-21 20:18 ` [PATCH 40/50] landlock: remove I_FREEING|I_WILL_FREE check Josef Bacik
2025-08-21 20:18 ` [PATCH 41/50] fs: change inode_is_dirtytime_only to use refcount Josef Bacik
2025-08-21 20:18 ` [PATCH 42/50] btrfs: remove references to I_FREEING Josef Bacik
2025-08-21 20:18 ` [PATCH 43/50] ext4: remove reference to I_FREEING in inode.c Josef Bacik
2025-08-21 20:18 ` [PATCH 44/50] ext4: remove reference to I_FREEING in orphan.c Josef Bacik
2025-08-21 20:18 ` [PATCH 45/50] pnfs: use i_count refcount to determine if the inode is going away Josef Bacik
2025-08-21 20:18 ` [PATCH 46/50] fs: remove some spurious I_FREEING references in inode.c Josef Bacik
2025-08-21 20:18 ` [PATCH 47/50] xfs: remove reference to I_FREEING|I_WILL_FREE Josef Bacik
2025-08-21 20:18 ` [PATCH 48/50] ocfs2: do not set I_WILL_FREE Josef Bacik
2025-08-21 20:19 ` [PATCH 49/50] fs: remove I_FREEING|I_WILL_FREE Josef Bacik
2025-08-25 11:53 ` Christian Brauner
2025-08-21 20:19 ` [PATCH 50/50] fs: add documentation explaining the reference count rules for inodes Josef Bacik
2025-08-25 11:56 ` Christian Brauner
2025-08-22 10:51 ` [PATCH 00/50] fs: rework inode reference counting Christian Brauner
2025-08-22 13:30 ` Josef Bacik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250822133151.GB927384@perftesting \
--to=josef@toxicpanda.com \
--cc=brauner@kernel.org \
--cc=kernel-team@fb.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.