* [WIP RFC PATCH v2 00/10] i_state accessors + I_WILL_FREE removal
@ 2025-09-09 9:13 Mateusz Guzik
2025-09-09 9:13 ` [PATCH v2 01/10] fs: expand dump_inode() Mateusz Guzik
` (8 more replies)
0 siblings, 9 replies; 11+ messages in thread
From: Mateusz Guzik @ 2025-09-09 9:13 UTC (permalink / raw)
To: brauner
Cc: viro, jack, linux-kernel, linux-fsdevel, josef, kernel-team,
amir73il, linux-btrfs, linux-ext4, linux-xfs, ocfs2-devel,
Mateusz Guzik
NOTE: this is a WIP not meant to be included anywhere yet and perhaps
should be split into 2 patchsets.
It is generated against:
https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/commit/?h=vfs-6.18.inode.refcount.preliminaries
The first patch in the series is ready to use(tm) and was sent
separately here:
https://lore.kernel.org/linux-fsdevel/20250909082613.1296550-1-mjguzik@gmail.com/T/#u
It is included in this posting as the newly patched routine has to get
patched further due to the i_state accessor thing. Having it here should
make it handier to test for anyone interested.
This is a cleaned up continuation of the churn-ey patch which merely removed I_WILL_FREE:
https://lore.kernel.org/linux-fsdevel/20250902145428.456510-1-mjguzik@gmail.com/
The entire thing is a response to the patchset by Josef Bacik concerning
refcount changes, see:
https://lore.kernel.org/linux-fsdevel/cover.1756222464.git.josef@toxicpanda.com/
I'm writing my second reply to that patchset, but in the meantime the
stuff below should facilitate work going forward, regardless of whether
the refcount patchset goes in or not.
The patchset splits churn from actual changes.
Plain ->i_state access is still possible to reduce upfront churn, only
some of the tree got covered so far.
short rundown:
fs: hide ->i_state handling behind accessors
This is churn-ey and should largely be a nop, worst case there will be
failed lock assertions if I messed up some annotations, which should
be easy to sort out. It covers fs/*.c and friends, but no filesystems.
bcachefs: use the new ->i_state accessors
btrfs: use the new ->i_state accessors
ext4: use the new ->i_state accessors
gfs2: use the new ->i_state accessors
ocfs2: use the new ->i_state accessors
This patches only the filesystems which reference I_WILL_FREE. Again
should be a nop.
ocfs2: retire ocfs2_drop_inode() and I_WILL_FREE usage
Actual change, needs ocfs2 folks approval.
fs: set I_FREEING instead of I_WILL_FREE in iput_final() prior to
writeback
Actual change. I_WILL_FREE still exists as a macro but is no longer set
by anything. As of right now I'm not fully confident this is correct,
the writeback code is more fuck-ey than not. However, figuring this out
is imo a hard prerequisite for the refcount patchset by Josef anyway.
fs: retire the I_WILL_FREE flag
Churn change to whack the flag. It also uses the opportunity to do some
cosmetics.
Onto the rationale:
I. i_state
The handling in the stock kernel is highly error prone and with 0 assert
coverage.
Notably there is nothing guaranteeing the caller owns the necessary lock
when making changes. But apart from that there are spots which look at
->i_state several times and it is unclear if they think it is stable or
not. Moreover spots used on inode teardown use WRITE_ONCE, some spots in
hash lookup use READ_ONCE, but everyone else issues plain loads and
stores which invites compiler mischief.
All of this is easily preventable.
The ideal state as I see it would also hide the field behind a struct so
that plain open-coded accesses fail to compile. Not done yet to reduce
churn.
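To illustrate the "hide the field behind a struct" idea, here is a minimal userspace sketch. It is not taken from the patchset: the type names (struct wrapped_state, struct wrapped_inode), the helper names and the flag value are all hypothetical, and locking is left out entirely.

```c
#include <assert.h>

/* Hypothetical flag value, for illustration only. */
#define WRAPPED_FLAG_A	0x4u

/*
 * Hypothetical wrapper type: the raw value lives inside a struct, so an
 * expression such as "inode->i_state & WRAPPED_FLAG_A" is not valid C.
 */
struct wrapped_state {
	unsigned int raw;	/* only the helpers below should touch this */
};

struct wrapped_inode {
	struct wrapped_state i_state;
};

/* Read the flags through the helper instead of open-coding the access. */
static inline unsigned int wrapped_state_read(const struct wrapped_inode *inode)
{
	return inode->i_state.raw;
}

/* Set flags through the helper. */
static inline void wrapped_state_add(struct wrapped_inode *inode,
				     unsigned int flags)
{
	inode->i_state.raw |= flags;
}
```

With this layout any leftover open-coded `inode->i_state |= flag` stops compiling, because a struct is not a valid operand for the bitwise operators, so unconverted call sites are found at build time rather than by review.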
Another step not taken here but to be sorted out later is strict
handling of flags, where it is illegal to clear flags which are not
present and to set flags which are already set -- the kernel should know
whether a given flag can legally be present or not (trivial example:
I_FREEING). Setting a flag which is already set is more likely to be a
logic error than not. If it turns out there are cases where a flag can
be legally already present/missing, an additional helper can be added to
forego the assertion.
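A rough userspace sketch of what such strict flag handling could look like follows. Everything in it is an assumption made for illustration: the helper names are invented, plain assert() stands in for whatever warning or assertion facility the kernel version would use, and the state is passed by value to keep the example free of locking.

```c
#include <assert.h>

/* Strict add: setting a flag which is already set is treated as a bug. */
static inline unsigned int strict_state_add(unsigned int state,
					    unsigned int flags)
{
	assert((state & flags) == 0);
	return state | flags;
}

/* Strict del: clearing a flag which is not present is treated as a bug. */
static inline unsigned int strict_state_del(unsigned int state,
					    unsigned int flags)
{
	assert((state & flags) == flags);
	return state & ~flags;
}

/*
 * Relaxed variants for the cases where a flag can legitimately be
 * already present or already missing; these skip the assertion.
 */
static inline unsigned int relaxed_state_add(unsigned int state,
					     unsigned int flags)
{
	return state | flags;
}

static inline unsigned int relaxed_state_del(unsigned int state,
					     unsigned int flags)
{
	return state & ~flags;
}
```

The point of keeping two sets of helpers is that the caller has to state explicitly which behaviour it expects, the same way the locked and unlocked read helpers make the caller state its locking assumptions.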
Practical examples:
	spin_lock(&inode->i_lock);
	if (inode_state_read(inode) & I_WHATEVER) {
		....
	}
	spin_unlock(&inode->i_lock);
This asserts the lock is held.
But if the caller is looking to do a lockless check first, they can do
it and explicitly denote this is what they want:
	if (inode_state_read_unlocked(inode) & I_WHATEVER) {
		spin_lock(&inode->i_lock);
		if (inode_state_read(inode) & I_WHATEVER) {
			....
		}
		spin_unlock(&inode->i_lock);
	}
Similarly:
	state = inode_state_read_unlocked(inode);
	if (state & I_CRAP) {
	} else if (state & I_MEH) {
	}
	...
We are guaranteed no mischief: the caller acknowledges the value in
the inode could have changed from under them and the load is done with
READ_ONCE (as opposed to plain ->i_state loads now).
Furthermore, should better lifecycle tracking get introduced, the
helpers can validate no flags get added when it is invalid to do so.
The *current* routines are as below. I don't care about specific
names, I do care about semantics.
/*
 * i_state handling
 *
 * We hide all of it behind helpers so that we can validate consumers.
 */
static inline enum inode_state_flags_enum inode_state_read(struct inode *inode)
{
	lockdep_assert_held(&inode->i_lock);
	return inode->i_state;
}

static inline enum inode_state_flags_enum inode_state_read_unlocked(struct inode *inode)
{
	return READ_ONCE(inode->i_state);
}

static inline void inode_state_add(struct inode *inode,
				   enum inode_state_flags_enum newflags)
{
	lockdep_assert_held(&inode->i_lock);
	WRITE_ONCE(inode->i_state, inode->i_state | newflags);
}

static inline void inode_state_del(struct inode *inode,
				   enum inode_state_flags_enum rmflags)
{
	lockdep_assert_held(&inode->i_lock);
	WRITE_ONCE(inode->i_state, inode->i_state & ~rmflags);
}

static inline void inode_state_set_unchecked(struct inode *inode,
					     enum inode_state_flags_enum newflags)
{
	WRITE_ONCE(inode->i_state, newflags);
}
The inode_state_set_unchecked() crapper is there to handle early access
during inode construction (before it lands in the hash).
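For anyone who wants to poke at the semantics outside of the kernel, below is a small userspace model of the helpers. It is only a sketch under several assumptions: a bool named lock_held stands in for ->i_lock plus lockdep, volatile casts approximate READ_ONCE/WRITE_ONCE, and all model_* names are made up for this example rather than taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>

/* Rough userspace approximations of READ_ONCE()/WRITE_ONCE(). */
#define MODEL_READ_ONCE(x)	(*(const volatile unsigned int *)&(x))
#define MODEL_WRITE_ONCE(x, v)	(*(volatile unsigned int *)&(x) = (v))

struct model_inode {
	bool lock_held;		/* stand-in for ->i_lock and lockdep state */
	unsigned int i_state;
};

static inline void model_lock(struct model_inode *inode)
{
	assert(!inode->lock_held);
	inode->lock_held = true;
}

static inline void model_unlock(struct model_inode *inode)
{
	assert(inode->lock_held);
	inode->lock_held = false;
}

/* Locked read: asserts the caller owns the lock. */
static inline unsigned int model_state_read(const struct model_inode *inode)
{
	assert(inode->lock_held);
	return inode->i_state;
}

/* Lockless read: the caller accepts the value may be stale. */
static inline unsigned int
model_state_read_unlocked(const struct model_inode *inode)
{
	return MODEL_READ_ONCE(inode->i_state);
}

static inline void model_state_add(struct model_inode *inode,
				   unsigned int newflags)
{
	assert(inode->lock_held);
	MODEL_WRITE_ONCE(inode->i_state, inode->i_state | newflags);
}

static inline void model_state_del(struct model_inode *inode,
				   unsigned int rmflags)
{
	assert(inode->lock_held);
	MODEL_WRITE_ONCE(inode->i_state, inode->i_state & ~rmflags);
}

/* No lock check: meant for construction, before anyone else can see it. */
static inline void model_state_set_unchecked(struct model_inode *inode,
					     unsigned int newflags)
{
	MODEL_WRITE_ONCE(inode->i_state, newflags);
}
```

In this model, calling model_state_add() without a preceding model_lock() trips the assert, which mirrors the failed lock assertion a misannotated caller would get from the real helpers.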
II. I_WILL_FREE removal
Sounds like nobody likes this flag and even the developer documenting it
in fs.h was not able to provide a justification for its existence,
merely stating how it is used.
As far as I can tell the only use was to allow ->drop_inode() handlers
to drop ->i_lock and still prevent anyone from picking up the inode. I
*suspect* this was used instead of I_FREEING because the routine could
have decided to *not* drop afterwards. Differentiating between
indicating the inode is going down vs just telling the consumer to
bugger off for the time being seemed like an ok idea.
However, the only filesystem using it today is ocfs2, it always returns
"drop it" and this usage does not even have to be there. Removed in one
of the patches.
Apart from that the only use was the write_inode_now() call in iput_final()
prior to setting I_FREEING anyway. This probably works as posted here,
but there might be some fuckery to sort out in writeback to truly
eliminate the flag. In the worst case it's just some work, but *so far*
I'm not staking anything on the patchset being fully correct yet.
tl;dr the flag does not have to be there, but there may be dragons in
writeback (to be seen). No matter what, shaking bugs out of this should
be considered a pre-requisite for any future work regarding inode
lifecycle (whether the refcount patchset lands or not, imo it should not
which I'll elaborate on later in that thread).
Apart from that the I_CREATING flag seems to have inconsistent handling,
but that's for another e-mail after I get a better hang of it.
So.. comments?
Mateusz Guzik (10):
fs: expand dump_inode()
fs: hide ->i_state handling behind accessors
bcachefs: use the new ->i_state accessors
btrfs: use the new ->i_state accessors
ext4: use the new ->i_state accessors
gfs2: use the new ->i_state accessors
ocfs2: use the new ->i_state accessors
ocfs2: retire ocfs2_drop_inode() and I_WILL_FREE usage
fs: set I_FREEING instead of I_WILL_FREE in iput_final() prior to
writeback
fs: retire the I_WILL_FREE flag
block/bdev.c | 4 +-
fs/bcachefs/fs.c | 8 +-
fs/btrfs/inode.c | 10 +--
fs/buffer.c | 4 +-
fs/crypto/keyring.c | 2 +-
fs/crypto/keysetup.c | 2 +-
fs/dcache.c | 8 +-
fs/drop_caches.c | 2 +-
fs/ext4/inode.c | 10 +--
fs/ext4/orphan.c | 4 +-
fs/fs-writeback.c | 131 +++++++++++++++----------------
fs/gfs2/file.c | 2 +-
fs/gfs2/glops.c | 2 +-
fs/gfs2/inode.c | 4 +-
fs/gfs2/ops_fstype.c | 2 +-
fs/inode.c | 115 ++++++++++++++-------------
fs/libfs.c | 6 +-
fs/namei.c | 8 +-
fs/notify/fsnotify.c | 8 +-
fs/ocfs2/dlmglue.c | 2 +-
fs/ocfs2/inode.c | 27 +------
fs/ocfs2/inode.h | 1 -
fs/ocfs2/ocfs2_trace.h | 2 -
fs/ocfs2/super.c | 2 +-
fs/pipe.c | 2 +-
fs/quota/dquot.c | 2 +-
fs/sync.c | 2 +-
fs/xfs/scrub/common.c | 3 +-
include/linux/backing-dev.h | 5 +-
include/linux/fs.h | 75 ++++++++++++------
include/linux/writeback.h | 4 +-
include/trace/events/writeback.h | 11 ++-
security/landlock/fs.c | 12 +--
33 files changed, 249 insertions(+), 233 deletions(-)
--
2.43.0
* [PATCH v2 01/10] fs: expand dump_inode()
2025-09-09 9:13 [WIP RFC PATCH v2 00/10] i_state accessors + I_WILL_FREE removal Mateusz Guzik
@ 2025-09-09 9:13 ` Mateusz Guzik
2025-09-09 9:13 ` [PATCH v2 02/10] fs: hide ->i_state handling behind accessors Mateusz Guzik
` (7 subsequent siblings)
8 siblings, 0 replies; 11+ messages in thread
From: Mateusz Guzik @ 2025-09-09 9:13 UTC (permalink / raw)
To: brauner
Cc: viro, jack, linux-kernel, linux-fsdevel, josef, kernel-team,
amir73il, linux-btrfs, linux-ext4, linux-xfs, ocfs2-devel,
Mateusz Guzik
This adds the fs name and a few fields from struct inode: i_mode,
i_opflags, i_flags and i_state.
All values printed raw, no attempt to pretty-print anything.
Compile tested for i386 and runtime tested on amd64.
Sample output:
[ 31.450263] VFS_WARN_ON_INODE("crap") encountered for inode ffff9b10837a3240
fs sockfs mode 140777 opflags c flags 0 state 100
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---
fs/inode.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/fs/inode.c b/fs/inode.c
index 833de5457a06..e8c712211822 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -2935,10 +2935,18 @@ EXPORT_SYMBOL(mode_strip_sgid);
*
* TODO: add a proper inode dumping routine, this is a stub to get debug off the
* ground.
+ *
+ * TODO: handle getting to fs type with get_kernel_nofault()?
+ * See dump_mapping() above.
*/
void dump_inode(struct inode *inode, const char *reason)
{
- pr_warn("%s encountered for inode %px", reason, inode);
+ struct super_block *sb = inode->i_sb;
+
+ pr_warn("%s encountered for inode %px\n"
+ "fs %s mode %ho opflags %hx flags %u state %x\n",
+ reason, inode, sb->s_type->name, inode->i_mode, inode->i_opflags,
+ inode->i_flags, inode->i_state);
}
EXPORT_SYMBOL(dump_inode);
--
2.43.0
* [PATCH v2 02/10] fs: hide ->i_state handling behind accessors
2025-09-09 9:13 [WIP RFC PATCH v2 00/10] i_state accessors + I_WILL_FREE removal Mateusz Guzik
2025-09-09 9:13 ` [PATCH v2 01/10] fs: expand dump_inode() Mateusz Guzik
@ 2025-09-09 9:13 ` Mateusz Guzik
2025-09-09 9:13 ` [PATCH v2 03/10] bcachefs: use the new ->i_state accessors Mateusz Guzik
` (6 subsequent siblings)
8 siblings, 0 replies; 11+ messages in thread
From: Mateusz Guzik @ 2025-09-09 9:13 UTC (permalink / raw)
To: brauner
Cc: viro, jack, linux-kernel, linux-fsdevel, josef, kernel-team,
amir73il, linux-btrfs, linux-ext4, linux-xfs, ocfs2-devel,
Mateusz Guzik
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---
block/bdev.c | 4 +-
fs/buffer.c | 4 +-
fs/crypto/keyring.c | 2 +-
fs/crypto/keysetup.c | 2 +-
fs/dcache.c | 8 +-
fs/drop_caches.c | 2 +-
fs/fs-writeback.c | 123 ++++++++++++++++---------------
fs/inode.c | 108 ++++++++++++++-------------
fs/libfs.c | 6 +-
fs/namei.c | 8 +-
fs/notify/fsnotify.c | 2 +-
fs/pipe.c | 2 +-
fs/quota/dquot.c | 2 +-
fs/sync.c | 2 +-
include/linux/backing-dev.h | 5 +-
include/linux/fs.h | 43 ++++++++++-
include/linux/writeback.h | 4 +-
include/trace/events/writeback.h | 8 +-
security/landlock/fs.c | 12 +--
19 files changed, 194 insertions(+), 153 deletions(-)
diff --git a/block/bdev.c b/block/bdev.c
index b77ddd12dc06..77f04042ac67 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -67,7 +67,7 @@ static void bdev_write_inode(struct block_device *bdev)
int ret;
spin_lock(&inode->i_lock);
- while (inode->i_state & I_DIRTY) {
+ while (inode_state_read(inode) & I_DIRTY) {
spin_unlock(&inode->i_lock);
ret = write_inode_now(inode, true);
if (ret)
@@ -1265,7 +1265,7 @@ void sync_bdevs(bool wait)
struct block_device *bdev;
spin_lock(&inode->i_lock);
- if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW) ||
+ if (inode_state_read(inode) & (I_FREEING|I_WILL_FREE|I_NEW) ||
mapping->nrpages == 0) {
spin_unlock(&inode->i_lock);
continue;
diff --git a/fs/buffer.c b/fs/buffer.c
index ead4dc85debd..1ed8c56310a4 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -611,9 +611,9 @@ int generic_buffers_fsync_noflush(struct file *file, loff_t start, loff_t end,
return err;
ret = sync_mapping_buffers(inode->i_mapping);
- if (!(inode->i_state & I_DIRTY_ALL))
+ if (!(inode_state_read_unlocked(inode) & I_DIRTY_ALL))
goto out;
- if (datasync && !(inode->i_state & I_DIRTY_DATASYNC))
+ if (datasync && !(inode_state_read_unlocked(inode) & I_DIRTY_DATASYNC))
goto out;
err = sync_inode_metadata(inode, 1);
diff --git a/fs/crypto/keyring.c b/fs/crypto/keyring.c
index 7557f6a88b8f..34beb60bc24e 100644
--- a/fs/crypto/keyring.c
+++ b/fs/crypto/keyring.c
@@ -957,7 +957,7 @@ static void evict_dentries_for_decrypted_inodes(struct fscrypt_master_key *mk)
list_for_each_entry(ci, &mk->mk_decrypted_inodes, ci_master_key_link) {
inode = ci->ci_inode;
spin_lock(&inode->i_lock);
- if (inode->i_state & (I_FREEING | I_WILL_FREE | I_NEW)) {
+ if (inode_state_read(inode) & (I_FREEING | I_WILL_FREE | I_NEW)) {
spin_unlock(&inode->i_lock);
continue;
}
diff --git a/fs/crypto/keysetup.c b/fs/crypto/keysetup.c
index c1f85715c276..d4d9f9a83253 100644
--- a/fs/crypto/keysetup.c
+++ b/fs/crypto/keysetup.c
@@ -859,7 +859,7 @@ int fscrypt_drop_inode(struct inode *inode)
* userspace is still using the files, inodes can be dirtied between
* then and now. We mustn't lose any writes, so skip dirty inodes here.
*/
- if (inode->i_state & I_DIRTY_ALL)
+ if (inode_state_read_unlocked(inode) & I_DIRTY_ALL)
return 0;
/*
diff --git a/fs/dcache.c b/fs/dcache.c
index 60046ae23d51..e01c8b678a6f 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -794,7 +794,7 @@ void d_mark_dontcache(struct inode *inode)
de->d_flags |= DCACHE_DONTCACHE;
spin_unlock(&de->d_lock);
}
- inode->i_state |= I_DONTCACHE;
+ inode_state_add(inode, I_DONTCACHE);
spin_unlock(&inode->i_lock);
}
EXPORT_SYMBOL(d_mark_dontcache);
@@ -1073,7 +1073,7 @@ struct dentry *d_find_alias_rcu(struct inode *inode)
spin_lock(&inode->i_lock);
// ->i_dentry and ->i_rcu are colocated, but the latter won't be
// used without having I_FREEING set, which means no aliases left
- if (likely(!(inode->i_state & I_FREEING) && !hlist_empty(l))) {
+ if (likely(!(inode_state_read(inode) & I_FREEING) && !hlist_empty(l))) {
if (S_ISDIR(inode->i_mode)) {
de = hlist_entry(l->first, struct dentry, d_u.d_alias);
} else {
@@ -1980,8 +1980,8 @@ void d_instantiate_new(struct dentry *entry, struct inode *inode)
security_d_instantiate(entry, inode);
spin_lock(&inode->i_lock);
__d_instantiate(entry, inode);
- WARN_ON(!(inode->i_state & I_NEW));
- inode->i_state &= ~I_NEW & ~I_CREATING;
+ WARN_ON(!(inode_state_read(inode) & I_NEW));
+ inode_state_del(inode, I_NEW | I_CREATING);
/*
* Pairs with the barrier in prepare_to_wait_event() to make sure
* ___wait_var_event() either sees the bit cleared or
diff --git a/fs/drop_caches.c b/fs/drop_caches.c
index 019a8b4eaaf9..73175ac2fe92 100644
--- a/fs/drop_caches.c
+++ b/fs/drop_caches.c
@@ -28,7 +28,7 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
* inodes without pages but we deliberately won't in case
* we need to reschedule to avoid softlockups.
*/
- if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) ||
+ if ((inode_state_read(inode) & (I_FREEING|I_WILL_FREE|I_NEW)) ||
(mapping_empty(inode->i_mapping) && !need_resched())) {
spin_unlock(&inode->i_lock);
continue;
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 6088a67b2aae..67c2157a7d21 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -121,7 +121,7 @@ static bool inode_io_list_move_locked(struct inode *inode,
{
assert_spin_locked(&wb->list_lock);
assert_spin_locked(&inode->i_lock);
- WARN_ON_ONCE(inode->i_state & I_FREEING);
+ WARN_ON_ONCE(inode_state_read(inode) & I_FREEING);
list_move(&inode->i_io_list, head);
@@ -304,9 +304,9 @@ static void inode_cgwb_move_to_attached(struct inode *inode,
{
assert_spin_locked(&wb->list_lock);
assert_spin_locked(&inode->i_lock);
- WARN_ON_ONCE(inode->i_state & I_FREEING);
+ WARN_ON_ONCE(inode_state_read(inode) & I_FREEING);
- inode->i_state &= ~I_SYNC_QUEUED;
+ inode_state_del(inode, I_SYNC_QUEUED);
if (wb != &wb->bdi->wb)
list_move(&inode->i_io_list, &wb->b_attached);
else
@@ -408,7 +408,7 @@ static bool inode_do_switch_wbs(struct inode *inode,
* Once I_FREEING or I_WILL_FREE are visible under i_lock, the eviction
* path owns the inode and we shouldn't modify ->i_io_list.
*/
- if (unlikely(inode->i_state & (I_FREEING | I_WILL_FREE)))
+ if (unlikely(inode_state_read(inode) & (I_FREEING | I_WILL_FREE)))
goto skip_switch;
trace_inode_switch_wbs(inode, old_wb, new_wb);
@@ -452,7 +452,7 @@ static bool inode_do_switch_wbs(struct inode *inode,
if (!list_empty(&inode->i_io_list)) {
inode->i_wb = new_wb;
- if (inode->i_state & I_DIRTY_ALL) {
+ if (inode_state_read(inode) & I_DIRTY_ALL) {
struct inode *pos;
list_for_each_entry(pos, &new_wb->b_dirty, i_io_list)
@@ -475,10 +475,11 @@ static bool inode_do_switch_wbs(struct inode *inode,
switched = true;
skip_switch:
/*
- * Paired with load_acquire in unlocked_inode_to_wb_begin() and
+ * Paired with an acquire fence in unlocked_inode_to_wb_begin() and
* ensures that the new wb is visible if they see !I_WB_SWITCH.
*/
- smp_store_release(&inode->i_state, inode->i_state & ~I_WB_SWITCH);
+ smp_wmb();
+ inode_state_del(inode, I_WB_SWITCH);
xa_unlock_irq(&mapping->i_pages);
spin_unlock(&inode->i_lock);
@@ -560,12 +561,12 @@ static bool inode_prepare_wbs_switch(struct inode *inode,
/* while holding I_WB_SWITCH, no one else can update the association */
spin_lock(&inode->i_lock);
if (!(inode->i_sb->s_flags & SB_ACTIVE) ||
- inode->i_state & (I_WB_SWITCH | I_FREEING | I_WILL_FREE) ||
+ inode_state_read(inode) & (I_WB_SWITCH | I_FREEING | I_WILL_FREE) ||
inode_to_wb(inode) == new_wb) {
spin_unlock(&inode->i_lock);
return false;
}
- inode->i_state |= I_WB_SWITCH;
+ inode_state_add(inode, I_WB_SWITCH);
__iget(inode);
spin_unlock(&inode->i_lock);
@@ -587,7 +588,7 @@ static void inode_switch_wbs(struct inode *inode, int new_wb_id)
struct inode_switch_wbs_context *isw;
/* noop if seems to be already in progress */
- if (inode->i_state & I_WB_SWITCH)
+ if (inode_state_read_unlocked(inode) & I_WB_SWITCH)
return;
/* avoid queueing a new switch if too many are already in flight */
@@ -1197,9 +1198,9 @@ static void inode_cgwb_move_to_attached(struct inode *inode,
{
assert_spin_locked(&wb->list_lock);
assert_spin_locked(&inode->i_lock);
- WARN_ON_ONCE(inode->i_state & I_FREEING);
+ WARN_ON_ONCE(inode_state_read(inode) & I_FREEING);
- inode->i_state &= ~I_SYNC_QUEUED;
+ inode_state_del(inode, I_SYNC_QUEUED);
list_del_init(&inode->i_io_list);
wb_io_lists_depopulated(wb);
}
@@ -1312,7 +1313,7 @@ void inode_io_list_del(struct inode *inode)
wb = inode_to_wb_and_lock_list(inode);
spin_lock(&inode->i_lock);
- inode->i_state &= ~I_SYNC_QUEUED;
+ inode_state_del(inode, I_SYNC_QUEUED);
list_del_init(&inode->i_io_list);
wb_io_lists_depopulated(wb);
@@ -1370,13 +1371,13 @@ static void redirty_tail_locked(struct inode *inode, struct bdi_writeback *wb)
{
assert_spin_locked(&inode->i_lock);
- inode->i_state &= ~I_SYNC_QUEUED;
+ inode_state_del(inode, I_SYNC_QUEUED);
/*
* When the inode is being freed just don't bother with dirty list
* tracking. Flush worker will ignore this inode anyway and it will
* trigger assertions in inode_io_list_move_locked().
*/
- if (inode->i_state & I_FREEING) {
+ if (inode_state_read(inode) & I_FREEING) {
list_del_init(&inode->i_io_list);
wb_io_lists_depopulated(wb);
return;
@@ -1410,7 +1411,7 @@ static void inode_sync_complete(struct inode *inode)
{
assert_spin_locked(&inode->i_lock);
- inode->i_state &= ~I_SYNC;
+ inode_state_del(inode, I_SYNC);
/* If inode is clean an unused, put it into LRU now... */
inode_add_lru(inode);
/* Called with inode->i_lock which ensures memory ordering. */
@@ -1454,7 +1455,7 @@ static int move_expired_inodes(struct list_head *delaying_queue,
spin_lock(&inode->i_lock);
list_move(&inode->i_io_list, &tmp);
moved++;
- inode->i_state |= I_SYNC_QUEUED;
+ inode_state_add(inode, I_SYNC_QUEUED);
spin_unlock(&inode->i_lock);
if (sb_is_blkdev_sb(inode->i_sb))
continue;
@@ -1540,14 +1541,14 @@ void inode_wait_for_writeback(struct inode *inode)
assert_spin_locked(&inode->i_lock);
- if (!(inode->i_state & I_SYNC))
+ if (!(inode_state_read(inode) & I_SYNC))
return;
wq_head = inode_bit_waitqueue(&wqe, inode, __I_SYNC);
for (;;) {
prepare_to_wait_event(wq_head, &wqe.wq_entry, TASK_UNINTERRUPTIBLE);
/* Checking I_SYNC with inode->i_lock guarantees memory ordering. */
- if (!(inode->i_state & I_SYNC))
+ if (!(inode_state_read(inode) & I_SYNC))
break;
spin_unlock(&inode->i_lock);
schedule();
@@ -1573,7 +1574,7 @@ static void inode_sleep_on_writeback(struct inode *inode)
wq_head = inode_bit_waitqueue(&wqe, inode, __I_SYNC);
prepare_to_wait_event(wq_head, &wqe.wq_entry, TASK_UNINTERRUPTIBLE);
/* Checking I_SYNC with inode->i_lock guarantees memory ordering. */
- sleep = !!(inode->i_state & I_SYNC);
+ sleep = !!(inode_state_read(inode) & I_SYNC);
spin_unlock(&inode->i_lock);
if (sleep)
schedule();
@@ -1592,7 +1593,7 @@ static void requeue_inode(struct inode *inode, struct bdi_writeback *wb,
struct writeback_control *wbc,
unsigned long dirtied_before)
{
- if (inode->i_state & I_FREEING)
+ if (inode_state_read(inode) & I_FREEING)
return;
/*
@@ -1600,7 +1601,7 @@ static void requeue_inode(struct inode *inode, struct bdi_writeback *wb,
* shot. If still dirty, it will be redirty_tail()'ed below. Update
* the dirty time to prevent enqueue and sync it again.
*/
- if ((inode->i_state & I_DIRTY) &&
+ if ((inode_state_read(inode) & I_DIRTY) &&
(wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages))
inode->dirtied_when = jiffies;
@@ -1611,7 +1612,7 @@ static void requeue_inode(struct inode *inode, struct bdi_writeback *wb,
* is odd for clean inodes, it can happen for some
* filesystems so handle that gracefully.
*/
- if (inode->i_state & I_DIRTY_ALL)
+ if (inode_state_read(inode) & I_DIRTY_ALL)
redirty_tail_locked(inode, wb);
else
inode_cgwb_move_to_attached(inode, wb);
@@ -1637,17 +1638,17 @@ static void requeue_inode(struct inode *inode, struct bdi_writeback *wb,
*/
redirty_tail_locked(inode, wb);
}
- } else if (inode->i_state & I_DIRTY) {
+ } else if (inode_state_read(inode) & I_DIRTY) {
/*
* Filesystems can dirty the inode during writeback operations,
* such as delayed allocation during submission or metadata
* updates after data IO completion.
*/
redirty_tail_locked(inode, wb);
- } else if (inode->i_state & I_DIRTY_TIME) {
+ } else if (inode_state_read(inode) & I_DIRTY_TIME) {
inode->dirtied_when = jiffies;
inode_io_list_move_locked(inode, wb, &wb->b_dirty_time);
- inode->i_state &= ~I_SYNC_QUEUED;
+ inode_state_del(inode, I_SYNC_QUEUED);
} else {
/* The inode is clean. Remove from writeback lists. */
inode_cgwb_move_to_attached(inode, wb);
@@ -1673,7 +1674,7 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
unsigned dirty;
int ret;
- WARN_ON(!(inode->i_state & I_SYNC));
+ WARN_ON(!(inode_state_read_unlocked(inode) & I_SYNC));
trace_writeback_single_inode_start(inode, wbc, nr_to_write);
@@ -1697,7 +1698,7 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
* mark_inode_dirty_sync() to notify the filesystem about it and to
* change I_DIRTY_TIME into I_DIRTY_SYNC.
*/
- if ((inode->i_state & I_DIRTY_TIME) &&
+ if ((inode_state_read_unlocked(inode) & I_DIRTY_TIME) &&
(wbc->sync_mode == WB_SYNC_ALL ||
time_after(jiffies, inode->dirtied_time_when +
dirtytime_expire_interval * HZ))) {
@@ -1712,8 +1713,8 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
* after handling timestamp expiration, as that may dirty the inode too.
*/
spin_lock(&inode->i_lock);
- dirty = inode->i_state & I_DIRTY;
- inode->i_state &= ~dirty;
+ dirty = inode_state_read(inode) & I_DIRTY;
+ inode_state_del(inode, dirty);
/*
* Paired with smp_mb() in __mark_inode_dirty(). This allows
@@ -1729,10 +1730,10 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
smp_mb();
if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
- inode->i_state |= I_DIRTY_PAGES;
- else if (unlikely(inode->i_state & I_PINNING_NETFS_WB)) {
- if (!(inode->i_state & I_DIRTY_PAGES)) {
- inode->i_state &= ~I_PINNING_NETFS_WB;
+ inode_state_add(inode, I_DIRTY_PAGES);
+ else if (unlikely(inode_state_read(inode) & I_PINNING_NETFS_WB)) {
+ if (!(inode_state_read(inode) & I_DIRTY_PAGES)) {
+ inode_state_del(inode, I_PINNING_NETFS_WB);
wbc->unpinned_netfs_wb = true;
dirty |= I_PINNING_NETFS_WB; /* Cause write_inode */
}
@@ -1768,11 +1769,11 @@ static int writeback_single_inode(struct inode *inode,
spin_lock(&inode->i_lock);
if (!icount_read(inode))
- WARN_ON(!(inode->i_state & (I_WILL_FREE|I_FREEING)));
+ WARN_ON(!(inode_state_read(inode) & (I_WILL_FREE|I_FREEING)));
else
- WARN_ON(inode->i_state & I_WILL_FREE);
+ WARN_ON(inode_state_read(inode) & I_WILL_FREE);
- if (inode->i_state & I_SYNC) {
+ if (inode_state_read(inode) & I_SYNC) {
/*
* Writeback is already running on the inode. For WB_SYNC_NONE,
* that's enough and we can just return. For WB_SYNC_ALL, we
@@ -1783,7 +1784,7 @@ static int writeback_single_inode(struct inode *inode,
goto out;
inode_wait_for_writeback(inode);
}
- WARN_ON(inode->i_state & I_SYNC);
+ WARN_ON(inode_state_read(inode) & I_SYNC);
/*
* If the inode is already fully clean, then there's nothing to do.
*
@@ -1791,11 +1792,11 @@ static int writeback_single_inode(struct inode *inode,
* still under writeback, e.g. due to prior WB_SYNC_NONE writeback. If
* there are any such pages, we'll need to wait for them.
*/
- if (!(inode->i_state & I_DIRTY_ALL) &&
+ if (!(inode_state_read(inode) & I_DIRTY_ALL) &&
(wbc->sync_mode != WB_SYNC_ALL ||
!mapping_tagged(inode->i_mapping, PAGECACHE_TAG_WRITEBACK)))
goto out;
- inode->i_state |= I_SYNC;
+ inode_state_add(inode, I_SYNC);
wbc_attach_and_unlock_inode(wbc, inode);
ret = __writeback_single_inode(inode, wbc);
@@ -1808,18 +1809,18 @@ static int writeback_single_inode(struct inode *inode,
* If the inode is freeing, its i_io_list shoudn't be updated
* as it can be finally deleted at this moment.
*/
- if (!(inode->i_state & I_FREEING)) {
+ if (!(inode_state_read(inode) & I_FREEING)) {
/*
* If the inode is now fully clean, then it can be safely
* removed from its writeback list (if any). Otherwise the
* flusher threads are responsible for the writeback lists.
*/
- if (!(inode->i_state & I_DIRTY_ALL))
+ if (!(inode_state_read(inode) & I_DIRTY_ALL))
inode_cgwb_move_to_attached(inode, wb);
- else if (!(inode->i_state & I_SYNC_QUEUED)) {
- if ((inode->i_state & I_DIRTY))
+ else if (!(inode_state_read(inode) & I_SYNC_QUEUED)) {
+ if ((inode_state_read(inode) & I_DIRTY))
redirty_tail_locked(inode, wb);
- else if (inode->i_state & I_DIRTY_TIME) {
+ else if (inode_state_read(inode) & I_DIRTY_TIME) {
inode->dirtied_when = jiffies;
inode_io_list_move_locked(inode,
wb,
@@ -1928,12 +1929,12 @@ static long writeback_sb_inodes(struct super_block *sb,
* kind writeout is handled by the freer.
*/
spin_lock(&inode->i_lock);
- if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) {
+ if (inode_state_read(inode) & (I_NEW | I_FREEING | I_WILL_FREE)) {
redirty_tail_locked(inode, wb);
spin_unlock(&inode->i_lock);
continue;
}
- if ((inode->i_state & I_SYNC) && wbc.sync_mode != WB_SYNC_ALL) {
+ if ((inode_state_read(inode) & I_SYNC) && wbc.sync_mode != WB_SYNC_ALL) {
/*
* If this inode is locked for writeback and we are not
* doing writeback-for-data-integrity, move it to
@@ -1955,14 +1956,14 @@ static long writeback_sb_inodes(struct super_block *sb,
* are doing WB_SYNC_NONE writeback. So this catches only the
* WB_SYNC_ALL case.
*/
- if (inode->i_state & I_SYNC) {
+ if (inode_state_read(inode) & I_SYNC) {
/* Wait for I_SYNC. This function drops i_lock... */
inode_sleep_on_writeback(inode);
/* Inode may be gone, start again */
spin_lock(&wb->list_lock);
continue;
}
- inode->i_state |= I_SYNC;
+ inode_state_add(inode, I_SYNC);
wbc_attach_and_unlock_inode(&wbc, inode);
write_chunk = writeback_chunk_size(wb, work);
@@ -2000,7 +2001,7 @@ static long writeback_sb_inodes(struct super_block *sb,
*/
tmp_wb = inode_to_wb_and_lock_list(inode);
spin_lock(&inode->i_lock);
- if (!(inode->i_state & I_DIRTY_ALL))
+ if (!(inode_state_read(inode) & I_DIRTY_ALL))
total_wrote++;
requeue_inode(inode, tmp_wb, &wbc, dirtied_before);
inode_sync_complete(inode);
@@ -2506,10 +2507,10 @@ void __mark_inode_dirty(struct inode *inode, int flags)
* We tell ->dirty_inode callback that timestamps need to
* be updated by setting I_DIRTY_TIME in flags.
*/
- if (inode->i_state & I_DIRTY_TIME) {
+ if (inode_state_read_unlocked(inode) & I_DIRTY_TIME) {
spin_lock(&inode->i_lock);
- if (inode->i_state & I_DIRTY_TIME) {
- inode->i_state &= ~I_DIRTY_TIME;
+ if (inode_state_read(inode) & I_DIRTY_TIME) {
+ inode_state_del(inode, I_DIRTY_TIME);
flags |= I_DIRTY_TIME;
}
spin_unlock(&inode->i_lock);
@@ -2546,16 +2547,16 @@ void __mark_inode_dirty(struct inode *inode, int flags)
*/
smp_mb();
- if ((inode->i_state & flags) == flags)
+ if ((inode_state_read_unlocked(inode) & flags) == flags)
return;
spin_lock(&inode->i_lock);
- if ((inode->i_state & flags) != flags) {
- const int was_dirty = inode->i_state & I_DIRTY;
+ if ((inode_state_read(inode) & flags) != flags) {
+ const int was_dirty = inode_state_read(inode) & I_DIRTY;
inode_attach_wb(inode, NULL);
- inode->i_state |= flags;
+ inode_state_add(inode, flags);
/*
* Grab inode's wb early because it requires dropping i_lock and we
@@ -2574,7 +2575,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
* the inode it will place it on the appropriate superblock
* list, based upon its state.
*/
- if (inode->i_state & I_SYNC_QUEUED)
+ if (inode_state_read(inode) & I_SYNC_QUEUED)
goto out_unlock;
/*
@@ -2585,7 +2586,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
if (inode_unhashed(inode))
goto out_unlock;
}
- if (inode->i_state & I_FREEING)
+ if (inode_state_read(inode) & I_FREEING)
goto out_unlock;
/*
@@ -2600,7 +2601,7 @@ void __mark_inode_dirty(struct inode *inode, int flags)
if (dirtytime)
inode->dirtied_time_when = jiffies;
- if (inode->i_state & I_DIRTY)
+ if (inode_state_read(inode) & I_DIRTY)
dirty_list = &wb->b_dirty;
else
dirty_list = &wb->b_dirty_time;
@@ -2696,7 +2697,7 @@ static void wait_sb_inodes(struct super_block *sb)
spin_unlock_irq(&sb->s_inode_wblist_lock);
spin_lock(&inode->i_lock);
- if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) {
+ if (inode_state_read(inode) & (I_FREEING|I_WILL_FREE|I_NEW)) {
spin_unlock(&inode->i_lock);
spin_lock_irq(&sb->s_inode_wblist_lock);
diff --git a/fs/inode.c b/fs/inode.c
index e8c712211822..20f36d54348c 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -233,7 +233,7 @@ int inode_init_always_gfp(struct super_block *sb, struct inode *inode, gfp_t gfp
inode->i_sb = sb;
inode->i_blkbits = sb->s_blocksize_bits;
inode->i_flags = 0;
- inode->i_state = 0;
+ inode_state_set_unchecked(inode, 0);
atomic64_set(&inode->i_sequence, 0);
atomic_set(&inode->i_count, 1);
inode->i_op = &empty_iops;
@@ -471,7 +471,7 @@ EXPORT_SYMBOL(set_nlink);
void inc_nlink(struct inode *inode)
{
if (unlikely(inode->i_nlink == 0)) {
- WARN_ON(!(inode->i_state & I_LINKABLE));
+ WARN_ON(!(inode_state_read_unlocked(inode) & I_LINKABLE));
atomic_long_dec(&inode->i_sb->s_remove_count);
}
@@ -532,7 +532,7 @@ EXPORT_SYMBOL(ihold);
static void __inode_add_lru(struct inode *inode, bool rotate)
{
- if (inode->i_state & (I_DIRTY_ALL | I_SYNC | I_FREEING | I_WILL_FREE))
+ if (inode_state_read(inode) & (I_DIRTY_ALL | I_SYNC | I_FREEING | I_WILL_FREE))
return;
if (icount_read(inode))
return;
@@ -544,7 +544,7 @@ static void __inode_add_lru(struct inode *inode, bool rotate)
if (list_lru_add_obj(&inode->i_sb->s_inode_lru, &inode->i_lru))
this_cpu_inc(nr_unused);
else if (rotate)
- inode->i_state |= I_REFERENCED;
+ inode_state_add(inode, I_REFERENCED);
}
struct wait_queue_head *inode_bit_waitqueue(struct wait_bit_queue_entry *wqe,
@@ -577,15 +577,15 @@ static void inode_lru_list_del(struct inode *inode)
static void inode_pin_lru_isolating(struct inode *inode)
{
lockdep_assert_held(&inode->i_lock);
- WARN_ON(inode->i_state & (I_LRU_ISOLATING | I_FREEING | I_WILL_FREE));
- inode->i_state |= I_LRU_ISOLATING;
+ WARN_ON(inode_state_read(inode) & (I_LRU_ISOLATING | I_FREEING | I_WILL_FREE));
+ inode_state_add(inode, I_LRU_ISOLATING);
}
static void inode_unpin_lru_isolating(struct inode *inode)
{
spin_lock(&inode->i_lock);
- WARN_ON(!(inode->i_state & I_LRU_ISOLATING));
- inode->i_state &= ~I_LRU_ISOLATING;
+ WARN_ON(!(inode_state_read(inode) & I_LRU_ISOLATING));
+ inode_state_del(inode, I_LRU_ISOLATING);
/* Called with inode->i_lock which ensures memory ordering. */
inode_wake_up_bit(inode, __I_LRU_ISOLATING);
spin_unlock(&inode->i_lock);
@@ -597,7 +597,7 @@ static void inode_wait_for_lru_isolating(struct inode *inode)
struct wait_queue_head *wq_head;
lockdep_assert_held(&inode->i_lock);
- if (!(inode->i_state & I_LRU_ISOLATING))
+ if (!(inode_state_read(inode) & I_LRU_ISOLATING))
return;
wq_head = inode_bit_waitqueue(&wqe, inode, __I_LRU_ISOLATING);
@@ -607,14 +607,14 @@ static void inode_wait_for_lru_isolating(struct inode *inode)
* Checking I_LRU_ISOLATING with inode->i_lock guarantees
* memory ordering.
*/
- if (!(inode->i_state & I_LRU_ISOLATING))
+ if (!(inode_state_read(inode) & I_LRU_ISOLATING))
break;
spin_unlock(&inode->i_lock);
schedule();
spin_lock(&inode->i_lock);
}
finish_wait(wq_head, &wqe.wq_entry);
- WARN_ON(inode->i_state & I_LRU_ISOLATING);
+ WARN_ON(inode_state_read(inode) & I_LRU_ISOLATING);
}
/**
@@ -761,11 +761,11 @@ void clear_inode(struct inode *inode)
*/
xa_unlock_irq(&inode->i_data.i_pages);
BUG_ON(!list_empty(&inode->i_data.i_private_list));
- BUG_ON(!(inode->i_state & I_FREEING));
- BUG_ON(inode->i_state & I_CLEAR);
+ BUG_ON(!(inode_state_read_unlocked(inode) & I_FREEING));
+ BUG_ON(inode_state_read_unlocked(inode) & I_CLEAR);
BUG_ON(!list_empty(&inode->i_wb_list));
/* don't need i_lock here, no concurrent mods to i_state */
- inode->i_state = I_FREEING | I_CLEAR;
+ inode_state_set_unchecked(inode, I_FREEING | I_CLEAR);
}
EXPORT_SYMBOL(clear_inode);
@@ -786,7 +786,7 @@ static void evict(struct inode *inode)
{
const struct super_operations *op = inode->i_sb->s_op;
- BUG_ON(!(inode->i_state & I_FREEING));
+ BUG_ON(!(inode_state_read_unlocked(inode) & I_FREEING));
BUG_ON(!list_empty(&inode->i_lru));
if (!list_empty(&inode->i_io_list))
@@ -829,7 +829,7 @@ static void evict(struct inode *inode)
* This also means we don't need any fences for the call below.
*/
inode_wake_up_bit(inode, __I_NEW);
- BUG_ON(inode->i_state != (I_FREEING | I_CLEAR));
+ BUG_ON(inode_state_read_unlocked(inode) != (I_FREEING | I_CLEAR));
destroy_inode(inode);
}
@@ -879,12 +879,12 @@ void evict_inodes(struct super_block *sb)
spin_unlock(&inode->i_lock);
continue;
}
- if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) {
+ if (inode_state_read(inode) & (I_NEW | I_FREEING | I_WILL_FREE)) {
spin_unlock(&inode->i_lock);
continue;
}
- inode->i_state |= I_FREEING;
+ inode_state_add(inode, I_FREEING);
inode_lru_list_del(inode);
spin_unlock(&inode->i_lock);
list_add(&inode->i_lru, &dispose);
@@ -938,7 +938,7 @@ static enum lru_status inode_lru_isolate(struct list_head *item,
* sync, or the last page cache deletion will requeue them.
*/
if (icount_read(inode) ||
- (inode->i_state & ~I_REFERENCED) ||
+ (inode_state_read(inode) & ~I_REFERENCED) ||
!mapping_shrinkable(&inode->i_data)) {
list_lru_isolate(lru, &inode->i_lru);
spin_unlock(&inode->i_lock);
@@ -947,8 +947,8 @@ static enum lru_status inode_lru_isolate(struct list_head *item,
}
/* Recently referenced inodes get one more pass */
- if (inode->i_state & I_REFERENCED) {
- inode->i_state &= ~I_REFERENCED;
+ if (inode_state_read(inode) & I_REFERENCED) {
+ inode_state_del(inode, I_REFERENCED);
spin_unlock(&inode->i_lock);
return LRU_ROTATE;
}
@@ -975,8 +975,8 @@ static enum lru_status inode_lru_isolate(struct list_head *item,
return LRU_RETRY;
}
- WARN_ON(inode->i_state & I_NEW);
- inode->i_state |= I_FREEING;
+ WARN_ON(inode_state_read(inode) & I_NEW);
+ inode_state_add(inode, I_FREEING);
list_lru_isolate_move(lru, &inode->i_lru, freeable);
spin_unlock(&inode->i_lock);
@@ -1025,11 +1025,11 @@ static struct inode *find_inode(struct super_block *sb,
if (!test(inode, data))
continue;
spin_lock(&inode->i_lock);
- if (inode->i_state & (I_FREEING|I_WILL_FREE)) {
+ if (inode_state_read(inode) & (I_FREEING|I_WILL_FREE)) {
__wait_on_freeing_inode(inode, is_inode_hash_locked);
goto repeat;
}
- if (unlikely(inode->i_state & I_CREATING)) {
+ if (unlikely(inode_state_read(inode) & I_CREATING)) {
spin_unlock(&inode->i_lock);
rcu_read_unlock();
return ERR_PTR(-ESTALE);
@@ -1066,11 +1066,11 @@ static struct inode *find_inode_fast(struct super_block *sb,
if (inode->i_sb != sb)
continue;
spin_lock(&inode->i_lock);
- if (inode->i_state & (I_FREEING|I_WILL_FREE)) {
+ if (inode_state_read(inode) & (I_FREEING|I_WILL_FREE)) {
__wait_on_freeing_inode(inode, is_inode_hash_locked);
goto repeat;
}
- if (unlikely(inode->i_state & I_CREATING)) {
+ if (unlikely(inode_state_read(inode) & I_CREATING)) {
spin_unlock(&inode->i_lock);
rcu_read_unlock();
return ERR_PTR(-ESTALE);
@@ -1180,8 +1180,8 @@ void unlock_new_inode(struct inode *inode)
{
lockdep_annotate_inode_mutex_key(inode);
spin_lock(&inode->i_lock);
- WARN_ON(!(inode->i_state & I_NEW));
- inode->i_state &= ~I_NEW & ~I_CREATING;
+ WARN_ON(!(inode_state_read(inode) & I_NEW));
+ inode_state_del(inode, I_NEW | I_CREATING);
/*
* Pairs with the barrier in prepare_to_wait_event() to make sure
* ___wait_var_event() either sees the bit cleared or
@@ -1197,8 +1197,8 @@ void discard_new_inode(struct inode *inode)
{
lockdep_annotate_inode_mutex_key(inode);
spin_lock(&inode->i_lock);
- WARN_ON(!(inode->i_state & I_NEW));
- inode->i_state &= ~I_NEW;
+ WARN_ON(!(inode_state_read(inode) & I_NEW));
+ inode_state_del(inode, I_NEW);
/*
* Pairs with the barrier in prepare_to_wait_event() to make sure
* ___wait_var_event() either sees the bit cleared or
@@ -1308,7 +1308,7 @@ struct inode *inode_insert5(struct inode *inode, unsigned long hashval,
* caller is responsible for filling in the contents
*/
spin_lock(&inode->i_lock);
- inode->i_state |= I_NEW;
+ inode_state_add(inode, I_NEW);
hlist_add_head_rcu(&inode->i_hash, head);
spin_unlock(&inode->i_lock);
@@ -1445,7 +1445,7 @@ struct inode *iget_locked(struct super_block *sb, unsigned long ino)
if (!old) {
inode->i_ino = ino;
spin_lock(&inode->i_lock);
- inode->i_state = I_NEW;
+ inode_state_add(inode, I_NEW);
hlist_add_head_rcu(&inode->i_hash, head);
spin_unlock(&inode->i_lock);
spin_unlock(&inode_hash_lock);
@@ -1538,7 +1538,7 @@ EXPORT_SYMBOL(iunique);
struct inode *igrab(struct inode *inode)
{
spin_lock(&inode->i_lock);
- if (!(inode->i_state & (I_FREEING|I_WILL_FREE))) {
+ if (!(inode_state_read(inode) & (I_FREEING|I_WILL_FREE))) {
__iget(inode);
spin_unlock(&inode->i_lock);
} else {
@@ -1728,7 +1728,7 @@ struct inode *find_inode_rcu(struct super_block *sb, unsigned long hashval,
hlist_for_each_entry_rcu(inode, head, i_hash) {
if (inode->i_sb == sb &&
- !(READ_ONCE(inode->i_state) & (I_FREEING | I_WILL_FREE)) &&
+ !(inode_state_read_unlocked(inode) & (I_FREEING | I_WILL_FREE)) &&
test(inode, data))
return inode;
}
@@ -1767,7 +1767,7 @@ struct inode *find_inode_by_ino_rcu(struct super_block *sb,
hlist_for_each_entry_rcu(inode, head, i_hash) {
if (inode->i_ino == ino &&
inode->i_sb == sb &&
- !(READ_ONCE(inode->i_state) & (I_FREEING | I_WILL_FREE)))
+ !(inode_state_read_unlocked(inode) & (I_FREEING | I_WILL_FREE)))
return inode;
}
return NULL;
@@ -1789,7 +1789,7 @@ int insert_inode_locked(struct inode *inode)
if (old->i_sb != sb)
continue;
spin_lock(&old->i_lock);
- if (old->i_state & (I_FREEING|I_WILL_FREE)) {
+ if (inode_state_read(old) & (I_FREEING|I_WILL_FREE)) {
spin_unlock(&old->i_lock);
continue;
}
@@ -1797,13 +1797,13 @@ int insert_inode_locked(struct inode *inode)
}
if (likely(!old)) {
spin_lock(&inode->i_lock);
- inode->i_state |= I_NEW | I_CREATING;
+ inode_state_add(inode, I_NEW | I_CREATING);
hlist_add_head_rcu(&inode->i_hash, head);
spin_unlock(&inode->i_lock);
spin_unlock(&inode_hash_lock);
return 0;
}
- if (unlikely(old->i_state & I_CREATING)) {
+ if (unlikely(inode_state_read(old) & I_CREATING)) {
spin_unlock(&old->i_lock);
spin_unlock(&inode_hash_lock);
return -EBUSY;
@@ -1826,7 +1826,12 @@ int insert_inode_locked4(struct inode *inode, unsigned long hashval,
{
struct inode *old;
- inode->i_state |= I_CREATING;
+ /*
+ * XXX ugly af
+ *
+ * this routine needs to die
+ */
+ inode_state_set_unchecked(inode, inode_state_read_unlocked(inode) | I_CREATING);
old = inode_insert5(inode, hashval, test, NULL, data);
if (old != inode) {
@@ -1858,10 +1863,9 @@ static void iput_final(struct inode *inode)
{
struct super_block *sb = inode->i_sb;
const struct super_operations *op = inode->i_sb->s_op;
- unsigned long state;
int drop;
- WARN_ON(inode->i_state & I_NEW);
+ WARN_ON(inode_state_read(inode) & I_NEW);
if (op->drop_inode)
drop = op->drop_inode(inode);
@@ -1869,27 +1873,25 @@ static void iput_final(struct inode *inode)
drop = generic_drop_inode(inode);
if (!drop &&
- !(inode->i_state & I_DONTCACHE) &&
+ !(inode_state_read(inode) & I_DONTCACHE) &&
(sb->s_flags & SB_ACTIVE)) {
__inode_add_lru(inode, true);
spin_unlock(&inode->i_lock);
return;
}
- state = inode->i_state;
if (!drop) {
- WRITE_ONCE(inode->i_state, state | I_WILL_FREE);
+ inode_state_add(inode, I_WILL_FREE);
spin_unlock(&inode->i_lock);
write_inode_now(inode, 1);
spin_lock(&inode->i_lock);
- state = inode->i_state;
- WARN_ON(state & I_NEW);
- state &= ~I_WILL_FREE;
+ inode_state_del(inode, I_WILL_FREE);
+ WARN_ON(inode_state_read(inode) & I_NEW);
}
- WRITE_ONCE(inode->i_state, state | I_FREEING);
+ inode_state_add(inode, I_FREEING);
if (!list_empty(&inode->i_lru))
inode_lru_list_del(inode);
spin_unlock(&inode->i_lock);
@@ -1913,7 +1915,7 @@ void iput(struct inode *inode)
retry:
lockdep_assert_not_held(&inode->i_lock);
- VFS_BUG_ON_INODE(inode->i_state & I_CLEAR, inode);
+ VFS_BUG_ON_INODE(inode_state_read_unlocked(inode) & I_CLEAR, inode);
/*
* Note this assert is technically racy as if the count is bogusly
* equal to one, then two CPUs racing to further drop it can both
@@ -1924,14 +1926,14 @@ void iput(struct inode *inode)
if (atomic_add_unless(&inode->i_count, -1, 1))
return;
- if ((inode->i_state & I_DIRTY_TIME) && inode->i_nlink) {
+ if ((inode_state_read_unlocked(inode) & I_DIRTY_TIME) && inode->i_nlink) {
trace_writeback_lazytime_iput(inode);
mark_inode_dirty_sync(inode);
goto retry;
}
spin_lock(&inode->i_lock);
- if (unlikely((inode->i_state & I_DIRTY_TIME) && inode->i_nlink)) {
+ if (unlikely((inode_state_read(inode) & I_DIRTY_TIME) && inode->i_nlink)) {
spin_unlock(&inode->i_lock);
goto retry;
}
@@ -2946,7 +2948,7 @@ void dump_inode(struct inode *inode, const char *reason)
pr_warn("%s encountered for inode %px\n"
"fs %s mode %ho opflags %hx flags %u state %x\n",
reason, inode, sb->s_type->name, inode->i_mode, inode->i_opflags,
- inode->i_flags, inode->i_state);
+ inode->i_flags, inode_state_read_unlocked(inode));
}
EXPORT_SYMBOL(dump_inode);
diff --git a/fs/libfs.c b/fs/libfs.c
index ce8c496a6940..7fe8db827d5e 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -1542,9 +1542,9 @@ int __generic_file_fsync(struct file *file, loff_t start, loff_t end,
inode_lock(inode);
ret = sync_mapping_buffers(inode->i_mapping);
- if (!(inode->i_state & I_DIRTY_ALL))
+ if (!(inode_state_read_unlocked(inode) & I_DIRTY_ALL))
goto out;
- if (datasync && !(inode->i_state & I_DIRTY_DATASYNC))
+ if (datasync && !(inode_state_read_unlocked(inode) & I_DIRTY_DATASYNC))
goto out;
err = sync_inode_metadata(inode, 1);
@@ -1664,7 +1664,7 @@ struct inode *alloc_anon_inode(struct super_block *s)
* list because mark_inode_dirty() will think
* that it already _is_ on the dirty list.
*/
- inode->i_state = I_DIRTY;
+ inode_state_set_unchecked(inode, I_DIRTY);
/*
* Historically anonymous inodes don't have a type at all and
* userspace has come to rely on this.
diff --git a/fs/namei.c b/fs/namei.c
index cd43ff89fbaa..250360f0219b 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3948,7 +3948,7 @@ int vfs_tmpfile(struct mnt_idmap *idmap,
inode = file_inode(file);
if (!(open_flag & O_EXCL)) {
spin_lock(&inode->i_lock);
- inode->i_state |= I_LINKABLE;
+ inode_state_add(inode, I_LINKABLE);
spin_unlock(&inode->i_lock);
}
security_inode_post_create_tmpfile(idmap, inode);
@@ -4844,7 +4844,7 @@ int vfs_link(struct dentry *old_dentry, struct mnt_idmap *idmap,
inode_lock(inode);
/* Make sure we don't allow creating hardlink to an unlinked file */
- if (inode->i_nlink == 0 && !(inode->i_state & I_LINKABLE))
+ if (inode->i_nlink == 0 && !(inode_state_read_unlocked(inode) & I_LINKABLE))
error = -ENOENT;
else if (max_links && inode->i_nlink >= max_links)
error = -EMLINK;
@@ -4854,9 +4854,9 @@ int vfs_link(struct dentry *old_dentry, struct mnt_idmap *idmap,
error = dir->i_op->link(old_dentry, dir, new_dentry);
}
- if (!error && (inode->i_state & I_LINKABLE)) {
+ if (!error && (inode_state_read_unlocked(inode) & I_LINKABLE)) {
spin_lock(&inode->i_lock);
- inode->i_state &= ~I_LINKABLE;
+ inode_state_del(inode, I_LINKABLE);
spin_unlock(&inode->i_lock);
}
inode_unlock(inode);
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 46bfc543f946..8e504b40fb39 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -52,7 +52,7 @@ static void fsnotify_unmount_inodes(struct super_block *sb)
* the inode cannot have any associated watches.
*/
spin_lock(&inode->i_lock);
- if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) {
+ if (inode_state_read(inode) & (I_FREEING|I_WILL_FREE|I_NEW)) {
spin_unlock(&inode->i_lock);
continue;
}
diff --git a/fs/pipe.c b/fs/pipe.c
index 731622d0738d..81ad740af963 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -906,7 +906,7 @@ static struct inode * get_pipe_inode(void)
* list because "mark_inode_dirty()" will think
* that it already _is_ on the dirty list.
*/
- inode->i_state = I_DIRTY;
+ inode_state_set_unchecked(inode, I_DIRTY);
inode->i_mode = S_IFIFO | S_IRUSR | S_IWUSR;
inode->i_uid = current_fsuid();
inode->i_gid = current_fsgid();
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index df4a9b348769..beb3d82a2915 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -1030,7 +1030,7 @@ static int add_dquot_ref(struct super_block *sb, int type)
spin_lock(&sb->s_inode_list_lock);
list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
spin_lock(&inode->i_lock);
- if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) ||
+ if ((inode_state_read(inode) & (I_FREEING|I_WILL_FREE|I_NEW)) ||
!atomic_read(&inode->i_writecount) ||
!dqinit_needed(inode, type)) {
spin_unlock(&inode->i_lock);
diff --git a/fs/sync.c b/fs/sync.c
index 2955cd4c77a3..f392ab921fc9 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -182,7 +182,7 @@ int vfs_fsync_range(struct file *file, loff_t start, loff_t end, int datasync)
if (!file->f_op->fsync)
return -EINVAL;
- if (!datasync && (inode->i_state & I_DIRTY_TIME))
+ if (!datasync && (inode_state_read_unlocked(inode) & I_DIRTY_TIME))
mark_inode_dirty_sync(inode);
return file->f_op->fsync(file, start, end, datasync);
}
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index e721148c95d0..3291b492c5e3 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -289,10 +289,11 @@ unlocked_inode_to_wb_begin(struct inode *inode, struct wb_lock_cookie *cookie)
rcu_read_lock();
/*
- * Paired with store_release in inode_switch_wbs_work_fn() and
+ * Paired with a release fence in inode_do_switch_wbs() and
* ensures that we see the new wb if we see cleared I_WB_SWITCH.
*/
- cookie->locked = smp_load_acquire(&inode->i_state) & I_WB_SWITCH;
+ cookie->locked = inode_state_read_unlocked(inode) & I_WB_SWITCH;
+ smp_rmb();
if (unlikely(cookie->locked))
xa_lock_irqsave(&inode->i_mapping->i_pages, cookie->flags);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c4fd010cf5bf..f933d181a40e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -756,7 +756,7 @@ enum inode_state_bits {
/* reserved wait address bit 3 */
};
-enum inode_state_flags_t {
+enum inode_state_flags_enum {
I_NEW = (1U << __I_NEW),
I_SYNC = (1U << __I_SYNC),
I_LRU_ISOLATING = (1U << __I_LRU_ISOLATING),
@@ -840,7 +840,7 @@ struct inode {
#endif
/* Misc */
- enum inode_state_flags_t i_state;
+ enum inode_state_flags_enum i_state;
/* 32-bit hole */
struct rw_semaphore i_rwsem;
@@ -899,6 +899,43 @@ struct inode {
void *i_private; /* fs or device private pointer */
} __randomize_layout;
+
+/*
+ * i_state handling
+ *
+ * We hide all of it behind helpers so that we can validate consumers.
+ */
+static inline enum inode_state_flags_enum inode_state_read(struct inode *inode)
+{
+ lockdep_assert_held(&inode->i_lock);
+ return inode->i_state;
+}
+
+static inline enum inode_state_flags_enum inode_state_read_unlocked(struct inode *inode)
+{
+ return READ_ONCE(inode->i_state);
+}
+
+static inline void inode_state_add(struct inode *inode,
+ enum inode_state_flags_enum newflags)
+{
+ lockdep_assert_held(&inode->i_lock);
+ WRITE_ONCE(inode->i_state, inode->i_state | newflags);
+}
+
+static inline void inode_state_del(struct inode *inode,
+ enum inode_state_flags_enum rmflags)
+{
+ lockdep_assert_held(&inode->i_lock);
+ WRITE_ONCE(inode->i_state, inode->i_state & ~rmflags);
+}
+
+static inline void inode_state_set_unchecked(struct inode *inode,
+ enum inode_state_flags_enum newflags)
+{
+ WRITE_ONCE(inode->i_state, newflags);
+}
+
static inline void inode_set_cached_link(struct inode *inode, char *link, int linklen)
{
VFS_WARN_ON_INODE(strlen(link) != linklen, inode);
@@ -2627,7 +2664,7 @@ static inline int icount_read(const struct inode *inode)
*/
static inline bool inode_is_dirtytime_only(struct inode *inode)
{
- return (inode->i_state & (I_DIRTY_TIME | I_NEW |
+ return (inode_state_read_unlocked(inode) & (I_DIRTY_TIME | I_NEW |
I_FREEING | I_WILL_FREE)) == I_DIRTY_TIME;
}
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index a2848d731a46..cd5d0db09639 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -193,7 +193,7 @@ void inode_io_list_del(struct inode *inode);
static inline void wait_on_inode(struct inode *inode)
{
wait_var_event(inode_state_wait_address(inode, __I_NEW),
- !(READ_ONCE(inode->i_state) & I_NEW));
+ !(inode_state_read_unlocked(inode) & I_NEW));
}
#ifdef CONFIG_CGROUP_WRITEBACK
@@ -234,7 +234,7 @@ static inline void inode_attach_wb(struct inode *inode, struct folio *folio)
static inline void inode_detach_wb(struct inode *inode)
{
if (inode->i_wb) {
- WARN_ON_ONCE(!(inode->i_state & I_CLEAR));
+ WARN_ON_ONCE(!(inode_state_read_unlocked(inode) & I_CLEAR));
wb_put(inode->i_wb);
inode->i_wb = NULL;
}
diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h
index 1e23919c0da9..82f6b7f26c29 100644
--- a/include/trace/events/writeback.h
+++ b/include/trace/events/writeback.h
@@ -120,7 +120,7 @@ DECLARE_EVENT_CLASS(writeback_dirty_inode_template,
/* may be called for files on pseudo FSes w/ unregistered bdi */
strscpy_pad(__entry->name, bdi_dev_name(bdi), 32);
__entry->ino = inode->i_ino;
- __entry->state = inode->i_state;
+ __entry->state = inode_state_read_unlocked(inode);
__entry->flags = flags;
),
@@ -719,7 +719,7 @@ TRACE_EVENT(writeback_sb_inodes_requeue,
strscpy_pad(__entry->name,
bdi_dev_name(inode_to_bdi(inode)), 32);
__entry->ino = inode->i_ino;
- __entry->state = inode->i_state;
+ __entry->state = inode_state_read_unlocked(inode);
__entry->dirtied_when = inode->dirtied_when;
__entry->cgroup_ino = __trace_wb_assign_cgroup(inode_to_wb(inode));
),
@@ -758,7 +758,7 @@ DECLARE_EVENT_CLASS(writeback_single_inode_template,
strscpy_pad(__entry->name,
bdi_dev_name(inode_to_bdi(inode)), 32);
__entry->ino = inode->i_ino;
- __entry->state = inode->i_state;
+ __entry->state = inode_state_read_unlocked(inode);
__entry->dirtied_when = inode->dirtied_when;
__entry->writeback_index = inode->i_mapping->writeback_index;
__entry->nr_to_write = nr_to_write;
@@ -810,7 +810,7 @@ DECLARE_EVENT_CLASS(writeback_inode_template,
TP_fast_assign(
__entry->dev = inode->i_sb->s_dev;
__entry->ino = inode->i_ino;
- __entry->state = inode->i_state;
+ __entry->state = inode_state_read_unlocked(inode);
__entry->mode = inode->i_mode;
__entry->dirtied_when = inode->dirtied_when;
),
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 0bade2c5aa1d..56c30d04971f 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -1290,13 +1290,13 @@ static void hook_sb_delete(struct super_block *const sb)
*/
spin_lock(&inode->i_lock);
/*
- * Checks I_FREEING and I_WILL_FREE to protect against a race
- * condition when release_inode() just called iput(), which
- * could lead to a NULL dereference of inode->security or a
- * second call to iput() for the same Landlock object. Also
- * checks I_NEW because such inode cannot be tied to an object.
+ * Checks I_FREEING to protect against a race condition when
+ * release_inode() just called iput(), which could lead to a
+ * NULL dereference of inode->security or a second call to
+ * iput() for the same Landlock object. Also checks I_NEW
+ * because such inode cannot be tied to an object.
*/
- if (inode->i_state & (I_FREEING | I_WILL_FREE | I_NEW)) {
+ if (inode_state_read(inode) & (I_FREEING | I_WILL_FREE | I_NEW)) {
spin_unlock(&inode->i_lock);
continue;
}
--
2.43.0
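
Illustration only, not part of the patch: the backing-dev.h hunk above
replaces smp_load_acquire() on ->i_state with a plain unlocked read
followed by smp_rmb(). A rough userspace model of that pairing, assuming
C11 atomics as a stand-in for the kernel primitives, could look like the
sketch below. All names (m_state, m_wb, M_WB_SWITCH, model_run) are
hypothetical, and the writer side is modelled only from the "paired with
a release fence" comment; the real writer is not shown in this mail.
Note that a C11 acquire fence is stronger than smp_rmb(), which only
orders loads against later loads.

```c
#include <pthread.h>
#include <stdatomic.h>

#define M_WB_SWITCH 1u			/* hypothetical flag value */

static atomic_uint m_state;		/* stands in for inode->i_state */
static int m_wb;			/* stands in for inode->i_wb */

static void *m_writer(void *arg)
{
	(void)arg;
	m_wb = 2;					/* publish the new wb */
	atomic_thread_fence(memory_order_release);	/* writer-side release fence */
	atomic_store_explicit(&m_state, 0, memory_order_relaxed);
	return NULL;
}

static void *m_reader(void *arg)
{
	int *seen = arg;

	/* The kernel takes the i_pages lock when the flag is set; we just spin. */
	while (atomic_load_explicit(&m_state, memory_order_relaxed) & M_WB_SWITCH)
		;
	/* Models smp_rmb(): the wb load below cannot move before the flag load. */
	atomic_thread_fence(memory_order_acquire);
	*seen = m_wb;
	return NULL;
}

/* Returns the wb value the reader observed once it saw the flag cleared. */
int model_run(void)
{
	pthread_t r, w;
	int seen = 0;

	m_wb = 1;
	atomic_store(&m_state, M_WB_SWITCH);
	pthread_create(&r, NULL, m_reader, &seen);
	pthread_create(&w, NULL, m_writer, NULL);
	pthread_join(w, NULL);
	pthread_join(r, NULL);
	return seen;
}
```

In this model a reader that observes the cleared flag always observes
the new wb as well, which is the property the in-code comment asks for.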
* [PATCH v2 03/10] bcachefs: use the new ->i_state accessors
2025-09-09 9:13 [WIP RFC PATCH v2 00/10] i_state accessors + I_WILL_FREE removal Mateusz Guzik
2025-09-09 9:13 ` [PATCH v2 01/10] fs: expand dump_inode() Mateusz Guzik
2025-09-09 9:13 ` [PATCH v2 02/10] fs: hide ->i_state handling behind accessors Mateusz Guzik
@ 2025-09-09 9:13 ` Mateusz Guzik
2025-09-09 9:13 ` [PATCH v2 04/10] btrfs: " Mateusz Guzik
` (5 subsequent siblings)
8 siblings, 0 replies; 11+ messages in thread
From: Mateusz Guzik @ 2025-09-09 9:13 UTC (permalink / raw)
To: brauner
Cc: viro, jack, linux-kernel, linux-fsdevel, josef, kernel-team,
amir73il, linux-btrfs, linux-ext4, linux-xfs, ocfs2-devel,
Mateusz Guzik
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---
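
Illustration only, not part of the patch: a minimal userspace sketch of
the accessor pattern the conversions below rely on, assuming a pthread
mutex in place of ->i_lock and a boolean in place of lockdep. Every
name here (model_inode, model_state_*, M_NEW, M_FREEING, model_demo) is
hypothetical and the flag values are made up; the real helpers are the
ones added to include/linux/fs.h earlier in the series.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

enum { M_NEW = 1u << 0, M_FREEING = 1u << 1 };	/* illustrative values */

struct model_inode {
	pthread_mutex_t i_lock;
	bool i_lock_held;	/* crude stand-in for lockdep_assert_held() */
	unsigned int i_state;
};

static void model_lock(struct model_inode *inode)
{
	pthread_mutex_lock(&inode->i_lock);
	inode->i_lock_held = true;
}

static void model_unlock(struct model_inode *inode)
{
	inode->i_lock_held = false;
	pthread_mutex_unlock(&inode->i_lock);
}

/* Locked read: the caller must hold i_lock. */
static unsigned int model_state_read(struct model_inode *inode)
{
	assert(inode->i_lock_held);
	return inode->i_state;
}

/* Lockless read: a volatile access stands in for READ_ONCE(). */
static unsigned int model_state_read_unlocked(struct model_inode *inode)
{
	return *(volatile unsigned int *)&inode->i_state;
}

static void model_state_add(struct model_inode *inode, unsigned int flags)
{
	assert(inode->i_lock_held);
	*(volatile unsigned int *)&inode->i_state = inode->i_state | flags;
}

static void model_state_del(struct model_inode *inode, unsigned int flags)
{
	assert(inode->i_lock_held);
	*(volatile unsigned int *)&inode->i_state = inode->i_state & ~flags;
}

/* No lock assertion: for call sites known to be free of concurrent writers. */
static void model_state_set_unchecked(struct model_inode *inode, unsigned int flags)
{
	*(volatile unsigned int *)&inode->i_state = flags;
}

unsigned int model_demo(void)
{
	struct model_inode inode = { .i_lock = PTHREAD_MUTEX_INITIALIZER };

	/* Unlocked read-modify-write, mirroring the I_NEW hunk in this patch. */
	model_state_set_unchecked(&inode, model_state_read_unlocked(&inode) | M_NEW);

	model_lock(&inode);
	if (model_state_read(&inode) & M_NEW) {
		model_state_del(&inode, M_NEW);		/* was: i_state &= ~I_NEW */
		model_state_add(&inode, M_FREEING);	/* was: i_state |= I_FREEING */
	}
	model_unlock(&inode);

	return model_state_read_unlocked(&inode);
}
```

The point of the split is that calling a locked accessor without the
lock trips an assertion, while the _unlocked and _unchecked variants
mark the call sites that deliberately skip the lock.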
fs/bcachefs/fs.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/bcachefs/fs.c b/fs/bcachefs/fs.c
index 687af0eea0c2..172685ced878 100644
--- a/fs/bcachefs/fs.c
+++ b/fs/bcachefs/fs.c
@@ -347,7 +347,7 @@ static struct bch_inode_info *bch2_inode_hash_find(struct bch_fs *c, struct btre
spin_unlock(&inode->v.i_lock);
return NULL;
}
- if ((inode->v.i_state & (I_FREEING|I_WILL_FREE))) {
+ if ((inode_state_read(&inode->v) & (I_FREEING|I_WILL_FREE))) {
if (!trans) {
__wait_on_freeing_inode(c, inode, inum);
} else {
@@ -411,7 +411,7 @@ static struct bch_inode_info *bch2_inode_hash_insert(struct bch_fs *c,
* only insert fully created inodes in the inode hash table. But
* discard_new_inode() expects it to be set...
*/
- inode->v.i_state |= I_NEW;
+ inode_state_set_unchecked(&inode->v, inode_state_read_unlocked(&inode->v) | I_NEW);
/*
* We don't want bch2_evict_inode() to delete the inode on disk,
* we just raced and had another inode in cache. Normally new
@@ -2224,8 +2224,8 @@ void bch2_evict_subvolume_inodes(struct bch_fs *c, snapshot_id_list *s)
if (!snapshot_list_has_id(s, inode->ei_inum.subvol))
continue;
- if (!(inode->v.i_state & I_DONTCACHE) &&
- !(inode->v.i_state & I_FREEING) &&
+ if (!(inode_state_read_unlocked(&inode->v) & I_DONTCACHE) &&
+ !(inode_state_read_unlocked(&inode->v) & I_FREEING) &&
igrab(&inode->v)) {
this_pass_clean = false;
--
2.43.0
* [PATCH v2 04/10] btrfs: use the new ->i_state accessors
2025-09-09 9:13 [WIP RFC PATCH v2 00/10] i_state accessors + I_WILL_FREE removal Mateusz Guzik
` (2 preceding siblings ...)
2025-09-09 9:13 ` [PATCH v2 03/10] bcachefs: use the new ->i_state accessors Mateusz Guzik
@ 2025-09-09 9:13 ` Mateusz Guzik
2025-09-09 9:13 ` [PATCH v2 05/10] ext4: " Mateusz Guzik
` (4 subsequent siblings)
8 siblings, 0 replies; 11+ messages in thread
From: Mateusz Guzik @ 2025-09-09 9:13 UTC (permalink / raw)
To: brauner
Cc: viro, jack, linux-kernel, linux-fsdevel, josef, kernel-team,
amir73il, linux-btrfs, linux-ext4, linux-xfs, ocfs2-devel,
Mateusz Guzik
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---
fs/btrfs/inode.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 5bcd8e25fa78..1cfdba42f072 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3856,7 +3856,7 @@ static int btrfs_add_inode_to_root(struct btrfs_inode *inode, bool prealloc)
ASSERT(ret != -ENOMEM);
return ret;
} else if (existing) {
- WARN_ON(!(existing->vfs_inode.i_state & (I_WILL_FREE | I_FREEING)));
+ WARN_ON(!(inode_state_read_unlocked(&existing->vfs_inode) & (I_WILL_FREE | I_FREEING)));
}
return 0;
@@ -5317,7 +5317,7 @@ static void evict_inode_truncate_pages(struct inode *inode)
struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree;
struct rb_node *node;
- ASSERT(inode->i_state & I_FREEING);
+ ASSERT(inode_state_read_unlocked(inode) & I_FREEING);
truncate_inode_pages_final(&inode->i_data);
btrfs_drop_extent_map_range(BTRFS_I(inode), 0, (u64)-1, false);
@@ -5745,7 +5745,7 @@ struct btrfs_inode *btrfs_iget_path(u64 ino, struct btrfs_root *root,
if (!inode)
return ERR_PTR(-ENOMEM);
- if (!(inode->vfs_inode.i_state & I_NEW))
+ if (!(inode_state_read_unlocked(&inode->vfs_inode) & I_NEW))
return inode;
ret = btrfs_read_locked_inode(inode, path);
@@ -5769,7 +5769,7 @@ struct btrfs_inode *btrfs_iget(u64 ino, struct btrfs_root *root)
if (!inode)
return ERR_PTR(-ENOMEM);
- if (!(inode->vfs_inode.i_state & I_NEW))
+ if (!(inode_state_read_unlocked(&inode->vfs_inode) & I_NEW))
return inode;
path = btrfs_alloc_path();
@@ -7435,7 +7435,7 @@ static void btrfs_invalidate_folio(struct folio *folio, size_t offset,
u64 page_start = folio_pos(folio);
u64 page_end = page_start + folio_size(folio) - 1;
u64 cur;
- int inode_evicting = inode->vfs_inode.i_state & I_FREEING;
+ int inode_evicting = inode_state_read_unlocked(&inode->vfs_inode) & I_FREEING;
/*
* We have folio locked so no new ordered extent can be created on this
--
2.43.0
* [PATCH v2 05/10] ext4: use the new ->i_state accessors
2025-09-09 9:13 [WIP RFC PATCH v2 00/10] i_state accessors + I_WILL_FREE removal Mateusz Guzik
` (3 preceding siblings ...)
2025-09-09 9:13 ` [PATCH v2 04/10] btrfs: " Mateusz Guzik
@ 2025-09-09 9:13 ` Mateusz Guzik
2025-09-09 9:13 ` [PATCH v2 07/10] ocfs2: " Mateusz Guzik
` (3 subsequent siblings)
8 siblings, 0 replies; 11+ messages in thread
From: Mateusz Guzik @ 2025-09-09 9:13 UTC (permalink / raw)
To: brauner
Cc: viro, jack, linux-kernel, linux-fsdevel, josef, kernel-team,
amir73il, linux-btrfs, linux-ext4, linux-xfs, ocfs2-devel,
Mateusz Guzik
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---
fs/ext4/inode.c | 10 +++++-----
fs/ext4/orphan.c | 4 ++--
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index ed54c4d0f2f9..55d87fa1c5af 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -425,7 +425,7 @@ void ext4_check_map_extents_env(struct inode *inode)
if (!S_ISREG(inode->i_mode) ||
IS_NOQUOTA(inode) || IS_VERITY(inode) ||
is_special_ino(inode->i_sb, inode->i_ino) ||
- (inode->i_state & (I_FREEING | I_WILL_FREE | I_NEW)) ||
+ (inode_state_read_unlocked(inode) & (I_FREEING | I_WILL_FREE | I_NEW)) ||
ext4_test_inode_flag(inode, EXT4_INODE_EA_INODE) ||
ext4_verity_in_progress(inode))
return;
@@ -3473,7 +3473,7 @@ static bool ext4_inode_datasync_dirty(struct inode *inode)
/* Any metadata buffers to write? */
if (!list_empty(&inode->i_mapping->i_private_list))
return true;
- return inode->i_state & I_DIRTY_DATASYNC;
+ return inode_state_read_unlocked(inode) & I_DIRTY_DATASYNC;
}
static void ext4_set_iomap(struct inode *inode, struct iomap *iomap,
@@ -4581,7 +4581,7 @@ int ext4_truncate(struct inode *inode)
* or it's a completely new inode. In those cases we might not
* have i_rwsem locked because it's not necessary.
*/
- if (!(inode->i_state & (I_NEW|I_FREEING)))
+ if (!(inode_state_read_unlocked(inode) & (I_NEW|I_FREEING)))
WARN_ON(!inode_is_locked(inode));
trace_ext4_truncate_enter(inode);
@@ -5239,7 +5239,7 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
inode = iget_locked(sb, ino);
if (!inode)
return ERR_PTR(-ENOMEM);
- if (!(inode->i_state & I_NEW)) {
+ if (!(inode_state_read_unlocked(inode) & I_NEW)) {
ret = check_igot_inode(inode, flags, function, line);
if (ret) {
iput(inode);
@@ -5570,7 +5570,7 @@ static void __ext4_update_other_inode_time(struct super_block *sb,
if (inode_is_dirtytime_only(inode)) {
struct ext4_inode_info *ei = EXT4_I(inode);
- inode->i_state &= ~I_DIRTY_TIME;
+ inode_state_del(inode, I_DIRTY_TIME);
spin_unlock(&inode->i_lock);
spin_lock(&ei->i_raw_lock);
diff --git a/fs/ext4/orphan.c b/fs/ext4/orphan.c
index 7c7f792ad6ab..70444810bd7a 100644
--- a/fs/ext4/orphan.c
+++ b/fs/ext4/orphan.c
@@ -107,7 +107,7 @@ int ext4_orphan_add(handle_t *handle, struct inode *inode)
if (!sbi->s_journal || is_bad_inode(inode))
return 0;
- WARN_ON_ONCE(!(inode->i_state & (I_NEW | I_FREEING)) &&
+ WARN_ON_ONCE(!(inode_state_read_unlocked(inode) & (I_NEW | I_FREEING)) &&
!inode_is_locked(inode));
/*
* Inode orphaned in orphan file or in orphan list?
@@ -236,7 +236,7 @@ int ext4_orphan_del(handle_t *handle, struct inode *inode)
if (!sbi->s_journal && !(sbi->s_mount_state & EXT4_ORPHAN_FS))
return 0;
- WARN_ON_ONCE(!(inode->i_state & (I_NEW | I_FREEING)) &&
+ WARN_ON_ONCE(!(inode_state_read_unlocked(inode) & (I_NEW | I_FREEING)) &&
!inode_is_locked(inode));
if (ext4_test_inode_state(inode, EXT4_STATE_ORPHAN_FILE))
return ext4_orphan_file_del(handle, inode);
--
2.43.0
* [PATCH v2 07/10] ocfs2: use the new ->i_state accessors
2025-09-09 9:13 [WIP RFC PATCH v2 00/10] i_state accessors + I_WILL_FREE removal Mateusz Guzik
` (4 preceding siblings ...)
2025-09-09 9:13 ` [PATCH v2 05/10] ext4: " Mateusz Guzik
@ 2025-09-09 9:13 ` Mateusz Guzik
2025-09-09 9:13 ` [PATCH v2 08/10] ocfs2: retire ocfs2_drop_inode() and I_WILL_FREE usage Mateusz Guzik
` (2 subsequent siblings)
8 siblings, 0 replies; 11+ messages in thread
From: Mateusz Guzik @ 2025-09-09 9:13 UTC (permalink / raw)
To: brauner
Cc: viro, jack, linux-kernel, linux-fsdevel, josef, kernel-team,
amir73il, linux-btrfs, linux-ext4, linux-xfs, ocfs2-devel,
Mateusz Guzik
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---
fs/ocfs2/dlmglue.c | 2 +-
fs/ocfs2/inode.c | 10 +++++-----
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
index 92a6149da9c1..b3b7954926d6 100644
--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -2487,7 +2487,7 @@ int ocfs2_inode_lock_full_nested(struct inode *inode,
* which hasn't been populated yet, so clear the refresh flag
* and let the caller handle it.
*/
- if (inode->i_state & I_NEW) {
+ if (inode_state_read_unlocked(inode) & I_NEW) {
status = 0;
if (lockres)
ocfs2_complete_lock_res_refresh(lockres, 0);
diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
index 14bf440ea4df..02312d4fbd7b 100644
--- a/fs/ocfs2/inode.c
+++ b/fs/ocfs2/inode.c
@@ -152,8 +152,8 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
mlog_errno(PTR_ERR(inode));
goto bail;
}
- trace_ocfs2_iget5_locked(inode->i_state);
- if (inode->i_state & I_NEW) {
+ trace_ocfs2_iget5_locked(inode_state_read_unlocked(inode));
+ if (inode_state_read_unlocked(inode) & I_NEW) {
rc = ocfs2_read_locked_inode(inode, &args);
unlock_new_inode(inode);
}
@@ -1307,12 +1307,12 @@ int ocfs2_drop_inode(struct inode *inode)
inode->i_nlink, oi->ip_flags);
assert_spin_locked(&inode->i_lock);
- inode->i_state |= I_WILL_FREE;
+ inode_state_add(inode, I_WILL_FREE);
spin_unlock(&inode->i_lock);
write_inode_now(inode, 1);
spin_lock(&inode->i_lock);
- WARN_ON(inode->i_state & I_NEW);
- inode->i_state &= ~I_WILL_FREE;
+ WARN_ON(inode_state_read(inode) & I_NEW);
+ inode_state_del(inode, I_WILL_FREE);
return 1;
}
--
2.43.0
* [PATCH v2 08/10] ocfs2: retire ocfs2_drop_inode() and I_WILL_FREE usage
2025-09-09 9:13 [WIP RFC PATCH v2 00/10] i_state accessors + I_WILL_FREE removal Mateusz Guzik
` (5 preceding siblings ...)
2025-09-09 9:13 ` [PATCH v2 07/10] ocfs2: " Mateusz Guzik
@ 2025-09-09 9:13 ` Mateusz Guzik
2025-09-09 9:13 ` [PATCH v2 09/10] fs: set I_FREEING instead of I_WILL_FREE in iput_final() prior to writeback Mateusz Guzik
2025-09-09 9:13 ` [PATCH v2 10/10] fs: retire the I_WILL_FREE flag Mateusz Guzik
8 siblings, 0 replies; 11+ messages in thread
From: Mateusz Guzik @ 2025-09-09 9:13 UTC (permalink / raw)
To: brauner
Cc: viro, jack, linux-kernel, linux-fsdevel, josef, kernel-team,
amir73il, linux-btrfs, linux-ext4, linux-xfs, ocfs2-devel,
Mateusz Guzik
This postpones the writeout to ocfs2_evict_inode(), which I'm told is
fine (tm).
This is in preparation for I_WILL_FREE flag removal.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---
fs/ocfs2/inode.c | 23 ++---------------------
fs/ocfs2/inode.h | 1 -
fs/ocfs2/ocfs2_trace.h | 2 --
fs/ocfs2/super.c | 2 +-
4 files changed, 3 insertions(+), 25 deletions(-)
diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
index 02312d4fbd7b..671c2303019b 100644
--- a/fs/ocfs2/inode.c
+++ b/fs/ocfs2/inode.c
@@ -1287,6 +1287,8 @@ static void ocfs2_clear_inode(struct inode *inode)
void ocfs2_evict_inode(struct inode *inode)
{
+ write_inode_now(inode, 1);
+
if (!inode->i_nlink ||
(OCFS2_I(inode)->ip_flags & OCFS2_INODE_MAYBE_ORPHANED)) {
ocfs2_delete_inode(inode);
@@ -1296,27 +1298,6 @@ void ocfs2_evict_inode(struct inode *inode)
ocfs2_clear_inode(inode);
}
-/* Called under inode_lock, with no more references on the
- * struct inode, so it's safe here to check the flags field
- * and to manipulate i_nlink without any other locks. */
-int ocfs2_drop_inode(struct inode *inode)
-{
- struct ocfs2_inode_info *oi = OCFS2_I(inode);
-
- trace_ocfs2_drop_inode((unsigned long long)oi->ip_blkno,
- inode->i_nlink, oi->ip_flags);
-
- assert_spin_locked(&inode->i_lock);
- inode_state_add(inode, I_WILL_FREE);
- spin_unlock(&inode->i_lock);
- write_inode_now(inode, 1);
- spin_lock(&inode->i_lock);
- WARN_ON(inode_state_read(inode) & I_NEW);
- inode_state_del(inode, I_WILL_FREE);
-
- return 1;
-}
-
/*
* This is called from our getattr.
*/
diff --git a/fs/ocfs2/inode.h b/fs/ocfs2/inode.h
index accf03d4765e..07bd838e7843 100644
--- a/fs/ocfs2/inode.h
+++ b/fs/ocfs2/inode.h
@@ -116,7 +116,6 @@ static inline struct ocfs2_caching_info *INODE_CACHE(struct inode *inode)
}
void ocfs2_evict_inode(struct inode *inode);
-int ocfs2_drop_inode(struct inode *inode);
/* Flags for ocfs2_iget() */
#define OCFS2_FI_FLAG_SYSFILE 0x1
diff --git a/fs/ocfs2/ocfs2_trace.h b/fs/ocfs2/ocfs2_trace.h
index 54ed1495de9a..4b32fb5658ad 100644
--- a/fs/ocfs2/ocfs2_trace.h
+++ b/fs/ocfs2/ocfs2_trace.h
@@ -1569,8 +1569,6 @@ DEFINE_OCFS2_ULL_ULL_UINT_EVENT(ocfs2_delete_inode);
DEFINE_OCFS2_ULL_UINT_EVENT(ocfs2_clear_inode);
-DEFINE_OCFS2_ULL_UINT_UINT_EVENT(ocfs2_drop_inode);
-
TRACE_EVENT(ocfs2_inode_revalidate,
TP_PROTO(void *inode, unsigned long long ino,
unsigned int flags),
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 53daa4482406..e4b0d25f4869 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -129,7 +129,7 @@ static const struct super_operations ocfs2_sops = {
.statfs = ocfs2_statfs,
.alloc_inode = ocfs2_alloc_inode,
.free_inode = ocfs2_free_inode,
- .drop_inode = ocfs2_drop_inode,
+ .drop_inode = generic_delete_inode,
.evict_inode = ocfs2_evict_inode,
.sync_fs = ocfs2_sync_fs,
.put_super = ocfs2_put_super,
--
2.43.0
* [PATCH v2 09/10] fs: set I_FREEING instead of I_WILL_FREE in iput_final() prior to writeback
2025-09-09 9:13 [WIP RFC PATCH v2 00/10] i_state accessors + I_WILL_FREE removal Mateusz Guzik
` (6 preceding siblings ...)
2025-09-09 9:13 ` [PATCH v2 08/10] ocfs2: retire ocfs2_drop_inode() and I_WILL_FREE usage Mateusz Guzik
@ 2025-09-09 9:13 ` Mateusz Guzik
2025-09-09 18:32 ` Mateusz Guzik
2025-09-09 9:13 ` [PATCH v2 10/10] fs: retire the I_WILL_FREE flag Mateusz Guzik
8 siblings, 1 reply; 11+ messages in thread
From: Mateusz Guzik @ 2025-09-09 9:13 UTC (permalink / raw)
To: brauner
Cc: viro, jack, linux-kernel, linux-fsdevel, josef, kernel-team,
amir73il, linux-btrfs, linux-ext4, linux-xfs, ocfs2-devel,
Mateusz Guzik
This is in preparation for I_WILL_FREE flag removal.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---
fs/inode.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/fs/inode.c b/fs/inode.c
index 20f36d54348c..9c695339ec3e 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1880,18 +1880,17 @@ static void iput_final(struct inode *inode)
return;
}
+ inode_state_add(inode, I_FREEING);
+
if (!drop) {
- inode_state_add(inode, I_WILL_FREE);
spin_unlock(&inode->i_lock);
write_inode_now(inode, 1);
spin_lock(&inode->i_lock);
- inode_state_del(inode, I_WILL_FREE);
WARN_ON(inode_state_read(inode) & I_NEW);
}
- inode_state_add(inode, I_FREEING);
if (!list_empty(&inode->i_lru))
inode_lru_list_del(inode);
spin_unlock(&inode->i_lock);
--
2.43.0
* [PATCH v2 10/10] fs: retire the I_WILL_FREE flag
2025-09-09 9:13 [WIP RFC PATCH v2 00/10] i_state accessors + I_WILL_FREE removal Mateusz Guzik
` (7 preceding siblings ...)
2025-09-09 9:13 ` [PATCH v2 09/10] fs: set I_FREEING instead of I_WILL_FREE in iput_final() prior to writeback Mateusz Guzik
@ 2025-09-09 9:13 ` Mateusz Guzik
8 siblings, 0 replies; 11+ messages in thread
From: Mateusz Guzik @ 2025-09-09 9:13 UTC (permalink / raw)
To: brauner
Cc: viro, jack, linux-kernel, linux-fsdevel, josef, kernel-team,
amir73il, linux-btrfs, linux-ext4, linux-xfs, ocfs2-devel,
Mateusz Guzik
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---
block/bdev.c | 2 +-
fs/bcachefs/fs.c | 2 +-
fs/btrfs/inode.c | 2 +-
fs/crypto/keyring.c | 2 +-
fs/drop_caches.c | 2 +-
fs/ext4/inode.c | 4 ++--
fs/fs-writeback.c | 20 +++++++++----------
fs/gfs2/ops_fstype.c | 2 +-
fs/inode.c | 18 ++++++++---------
fs/notify/fsnotify.c | 8 ++++----
fs/quota/dquot.c | 2 +-
fs/xfs/scrub/common.c | 3 +--
include/linux/fs.h | 34 +++++++++++++-------------------
include/trace/events/writeback.h | 3 +--
security/landlock/fs.c | 2 +-
15 files changed, 48 insertions(+), 58 deletions(-)
diff --git a/block/bdev.c b/block/bdev.c
index 77f04042ac67..fb280f7b04c0 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -1265,7 +1265,7 @@ void sync_bdevs(bool wait)
struct block_device *bdev;
spin_lock(&inode->i_lock);
- if (inode_state_read(inode) & (I_FREEING|I_WILL_FREE|I_NEW) ||
+ if (inode_state_read(inode) & (I_NEW | I_FREEING) ||
mapping->nrpages == 0) {
spin_unlock(&inode->i_lock);
continue;
diff --git a/fs/bcachefs/fs.c b/fs/bcachefs/fs.c
index 172685ced878..1aeacc9a945e 100644
--- a/fs/bcachefs/fs.c
+++ b/fs/bcachefs/fs.c
@@ -347,7 +347,7 @@ static struct bch_inode_info *bch2_inode_hash_find(struct bch_fs *c, struct btre
spin_unlock(&inode->v.i_lock);
return NULL;
}
- if ((inode_state_read(&inode->v) & (I_FREEING|I_WILL_FREE))) {
+ if (inode_state_read(&inode->v) & I_FREEING) {
if (!trans) {
__wait_on_freeing_inode(c, inode, inum);
} else {
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 1cfdba42f072..5a3cb16a9420 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3856,7 +3856,7 @@ static int btrfs_add_inode_to_root(struct btrfs_inode *inode, bool prealloc)
ASSERT(ret != -ENOMEM);
return ret;
} else if (existing) {
- WARN_ON(!(inode_state_read_unlocked(&existing->vfs_inode) & (I_WILL_FREE | I_FREEING)));
+ WARN_ON(!(inode_state_read_unlocked(&existing->vfs_inode) & I_FREEING));
}
return 0;
diff --git a/fs/crypto/keyring.c b/fs/crypto/keyring.c
index 34beb60bc24e..b64726176218 100644
--- a/fs/crypto/keyring.c
+++ b/fs/crypto/keyring.c
@@ -957,7 +957,7 @@ static void evict_dentries_for_decrypted_inodes(struct fscrypt_master_key *mk)
list_for_each_entry(ci, &mk->mk_decrypted_inodes, ci_master_key_link) {
inode = ci->ci_inode;
spin_lock(&inode->i_lock);
- if (inode_state_read(inode) & (I_FREEING | I_WILL_FREE | I_NEW)) {
+ if (inode_state_read(inode) & (I_NEW | I_FREEING)) {
spin_unlock(&inode->i_lock);
continue;
}
diff --git a/fs/drop_caches.c b/fs/drop_caches.c
index 73175ac2fe92..687795d7846b 100644
--- a/fs/drop_caches.c
+++ b/fs/drop_caches.c
@@ -28,7 +28,7 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
* inodes without pages but we deliberately won't in case
* we need to reschedule to avoid softlockups.
*/
- if ((inode_state_read(inode) & (I_FREEING|I_WILL_FREE|I_NEW)) ||
+ if ((inode_state_read(inode) & (I_NEW | I_FREEING)) ||
(mapping_empty(inode->i_mapping) && !need_resched())) {
spin_unlock(&inode->i_lock);
continue;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 55d87fa1c5af..db9a24ca7192 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -425,7 +425,7 @@ void ext4_check_map_extents_env(struct inode *inode)
if (!S_ISREG(inode->i_mode) ||
IS_NOQUOTA(inode) || IS_VERITY(inode) ||
is_special_ino(inode->i_sb, inode->i_ino) ||
- (inode_state_read_unlocked(inode) & (I_FREEING | I_WILL_FREE | I_NEW)) ||
+ (inode_state_read_unlocked(inode) & (I_NEW | I_FREEING)) ||
ext4_test_inode_flag(inode, EXT4_INODE_EA_INODE) ||
ext4_verity_in_progress(inode))
return;
@@ -4581,7 +4581,7 @@ int ext4_truncate(struct inode *inode)
* or it's a completely new inode. In those cases we might not
* have i_rwsem locked because it's not necessary.
*/
- if (!(inode_state_read_unlocked(inode) & (I_NEW|I_FREEING)))
+ if (!(inode_state_read_unlocked(inode) & (I_NEW | I_FREEING)))
WARN_ON(!inode_is_locked(inode));
trace_ext4_truncate_enter(inode);
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 67c2157a7d21..a3fb9004b581 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -405,10 +405,10 @@ static bool inode_do_switch_wbs(struct inode *inode,
xa_lock_irq(&mapping->i_pages);
/*
- * Once I_FREEING or I_WILL_FREE are visible under i_lock, the eviction
- * path owns the inode and we shouldn't modify ->i_io_list.
+ * Once I_FREEING is visible under i_lock, the eviction path owns the
+ * inode and we shouldn't modify ->i_io_list.
*/
- if (unlikely(inode_state_read(inode) & (I_FREEING | I_WILL_FREE)))
+ if (unlikely(inode_state_read(inode) & I_FREEING))
goto skip_switch;
trace_inode_switch_wbs(inode, old_wb, new_wb);
@@ -561,7 +561,7 @@ static bool inode_prepare_wbs_switch(struct inode *inode,
/* while holding I_WB_SWITCH, no one else can update the association */
spin_lock(&inode->i_lock);
if (!(inode->i_sb->s_flags & SB_ACTIVE) ||
- inode_state_read(inode) & (I_WB_SWITCH | I_FREEING | I_WILL_FREE) ||
+ inode_state_read(inode) & (I_WB_SWITCH | I_FREEING) ||
inode_to_wb(inode) == new_wb) {
spin_unlock(&inode->i_lock);
return false;
@@ -1759,7 +1759,7 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
* whether it is a data-integrity sync (%WB_SYNC_ALL) or not (%WB_SYNC_NONE).
*
* To prevent the inode from going away, either the caller must have a reference
- * to the inode, or the inode must have I_WILL_FREE or I_FREEING set.
+ * to the inode, or the inode must have I_FREEING set.
*/
static int writeback_single_inode(struct inode *inode,
struct writeback_control *wbc)
@@ -1769,9 +1769,7 @@ static int writeback_single_inode(struct inode *inode,
spin_lock(&inode->i_lock);
if (!icount_read(inode))
- WARN_ON(!(inode_state_read(inode) & (I_WILL_FREE|I_FREEING)));
- else
- WARN_ON(inode_state_read(inode) & I_WILL_FREE);
+ WARN_ON(!(inode_state_read(inode) & I_FREEING));
if (inode_state_read(inode) & I_SYNC) {
/*
@@ -1929,7 +1927,7 @@ static long writeback_sb_inodes(struct super_block *sb,
* kind writeout is handled by the freer.
*/
spin_lock(&inode->i_lock);
- if (inode_state_read(inode) & (I_NEW | I_FREEING | I_WILL_FREE)) {
+ if (inode_state_read(inode) & (I_NEW | I_FREEING)) {
redirty_tail_locked(inode, wb);
spin_unlock(&inode->i_lock);
continue;
@@ -2697,7 +2695,7 @@ static void wait_sb_inodes(struct super_block *sb)
spin_unlock_irq(&sb->s_inode_wblist_lock);
spin_lock(&inode->i_lock);
- if (inode_state_read(inode) & (I_FREEING|I_WILL_FREE|I_NEW)) {
+ if (inode_state_read(inode) & (I_NEW | I_FREEING)) {
spin_unlock(&inode->i_lock);
spin_lock_irq(&sb->s_inode_wblist_lock);
@@ -2846,7 +2844,7 @@ EXPORT_SYMBOL(sync_inodes_sb);
* This function commits an inode to disk immediately if it is dirty. This is
* primarily needed by knfsd.
*
- * The caller must either have a ref on the inode or must have set I_WILL_FREE.
+ * The caller must either have a ref on the inode or must have set I_FREEING.
*/
int write_inode_now(struct inode *inode, int sync)
{
diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index 3ccd902ec5ba..240894adb34d 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -1749,7 +1749,7 @@ static void gfs2_evict_inodes(struct super_block *sb)
spin_lock(&sb->s_inode_list_lock);
list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
spin_lock(&inode->i_lock);
- if ((inode_state_read(inode) & (I_FREEING|I_WILL_FREE|I_NEW)) &&
+ if ((inode_state_read(inode) & (I_NEW | I_FREEING)) &&
!need_resched()) {
spin_unlock(&inode->i_lock);
continue;
diff --git a/fs/inode.c b/fs/inode.c
index 9c695339ec3e..6a69de0bf7bd 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -532,7 +532,7 @@ EXPORT_SYMBOL(ihold);
static void __inode_add_lru(struct inode *inode, bool rotate)
{
- if (inode_state_read(inode) & (I_DIRTY_ALL | I_SYNC | I_FREEING | I_WILL_FREE))
+ if (inode_state_read(inode) & (I_DIRTY_ALL | I_SYNC | I_FREEING))
return;
if (icount_read(inode))
return;
@@ -577,7 +577,7 @@ static void inode_lru_list_del(struct inode *inode)
static void inode_pin_lru_isolating(struct inode *inode)
{
lockdep_assert_held(&inode->i_lock);
- WARN_ON(inode_state_read(inode) & (I_LRU_ISOLATING | I_FREEING | I_WILL_FREE));
+ WARN_ON(inode_state_read(inode) & (I_LRU_ISOLATING | I_FREEING));
inode_state_add(inode, I_LRU_ISOLATING);
}
@@ -879,7 +879,7 @@ void evict_inodes(struct super_block *sb)
spin_unlock(&inode->i_lock);
continue;
}
- if (inode_state_read(inode) & (I_NEW | I_FREEING | I_WILL_FREE)) {
+ if (inode_state_read(inode) & (I_NEW | I_FREEING)) {
spin_unlock(&inode->i_lock);
continue;
}
@@ -1025,7 +1025,7 @@ static struct inode *find_inode(struct super_block *sb,
if (!test(inode, data))
continue;
spin_lock(&inode->i_lock);
- if (inode_state_read(inode) & (I_FREEING|I_WILL_FREE)) {
+ if (inode_state_read(inode) & I_FREEING) {
__wait_on_freeing_inode(inode, is_inode_hash_locked);
goto repeat;
}
@@ -1066,7 +1066,7 @@ static struct inode *find_inode_fast(struct super_block *sb,
if (inode->i_sb != sb)
continue;
spin_lock(&inode->i_lock);
- if (inode_state_read(inode) & (I_FREEING|I_WILL_FREE)) {
+ if (inode_state_read(inode) & I_FREEING) {
__wait_on_freeing_inode(inode, is_inode_hash_locked);
goto repeat;
}
@@ -1538,7 +1538,7 @@ EXPORT_SYMBOL(iunique);
struct inode *igrab(struct inode *inode)
{
spin_lock(&inode->i_lock);
- if (!(inode_state_read(inode) & (I_FREEING|I_WILL_FREE))) {
+ if (!(inode_state_read(inode) & I_FREEING)) {
__iget(inode);
spin_unlock(&inode->i_lock);
} else {
@@ -1728,7 +1728,7 @@ struct inode *find_inode_rcu(struct super_block *sb, unsigned long hashval,
hlist_for_each_entry_rcu(inode, head, i_hash) {
if (inode->i_sb == sb &&
- !(inode_state_read_unlocked(inode) & (I_FREEING | I_WILL_FREE)) &&
+ !(inode_state_read_unlocked(inode) & I_FREEING) &&
test(inode, data))
return inode;
}
@@ -1767,7 +1767,7 @@ struct inode *find_inode_by_ino_rcu(struct super_block *sb,
hlist_for_each_entry_rcu(inode, head, i_hash) {
if (inode->i_ino == ino &&
inode->i_sb == sb &&
- !(inode_state_read_unlocked(inode) & (I_FREEING | I_WILL_FREE)))
+ !(inode_state_read_unlocked(inode) & I_FREEING))
return inode;
}
return NULL;
@@ -1789,7 +1789,7 @@ int insert_inode_locked(struct inode *inode)
if (old->i_sb != sb)
continue;
spin_lock(&old->i_lock);
- if (inode_state_read(old) & (I_FREEING|I_WILL_FREE)) {
+ if (inode_state_read(old) & I_FREEING) {
spin_unlock(&old->i_lock);
continue;
}
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 8e504b40fb39..82b0e97454db 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -47,12 +47,12 @@ static void fsnotify_unmount_inodes(struct super_block *sb)
spin_lock(&sb->s_inode_list_lock);
list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
/*
- * We cannot __iget() an inode in state I_FREEING,
- * I_WILL_FREE, or I_NEW which is fine because by that point
- * the inode cannot have any associated watches.
+ * We cannot __iget() an inode in state I_NEW or I_FREEING
+ * which is fine because by that point the inode cannot have
+ * any associated watches.
*/
spin_lock(&inode->i_lock);
- if (inode_state_read(inode) & (I_FREEING|I_WILL_FREE|I_NEW)) {
+ if (inode_state_read(inode) & (I_NEW | I_FREEING)) {
spin_unlock(&inode->i_lock);
continue;
}
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index beb3d82a2915..bd5b1d10a52a 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -1030,7 +1030,7 @@ static int add_dquot_ref(struct super_block *sb, int type)
spin_lock(&sb->s_inode_list_lock);
list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
spin_lock(&inode->i_lock);
- if ((inode_state_read(inode) & (I_FREEING|I_WILL_FREE|I_NEW)) ||
+ if ((inode_state_read(inode) & (I_NEW | I_FREEING)) ||
!atomic_read(&inode->i_writecount) ||
!dqinit_needed(inode, type)) {
spin_unlock(&inode->i_lock);
diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
index 2ef7742be7d3..e678f944206f 100644
--- a/fs/xfs/scrub/common.c
+++ b/fs/xfs/scrub/common.c
@@ -1086,8 +1086,7 @@ xchk_install_handle_inode(
/*
* Install an already-referenced inode for scrubbing. Get our own reference to
- * the inode to make disposal simpler. The inode must not be in I_FREEING or
- * I_WILL_FREE state!
+ * the inode to make disposal simpler. The inode must not have I_FREEING set.
*/
int
xchk_install_live_inode(
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f933d181a40e..40c4c0e8dd45 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -671,8 +671,8 @@ is_uncached_acl(struct posix_acl *acl)
* I_DIRTY_DATASYNC, I_DIRTY_PAGES, and I_DIRTY_TIME.
*
* Four bits define the lifetime of an inode. Initially, inodes are I_NEW,
- * until that flag is cleared. I_WILL_FREE, I_FREEING and I_CLEAR are set at
- * various stages of removing an inode.
+ * until that flag is cleared. I_FREEING and I_CLEAR are set at various stages
+ * of removing an inode.
*
* Two bits are used for locking and completion notification, I_NEW and I_SYNC.
*
@@ -696,24 +696,21 @@ is_uncached_acl(struct posix_acl *acl)
* New inodes set I_NEW. If two processes both create
* the same inode, one of them will release its inode and
* wait for I_NEW to be released before returning.
- * Inodes in I_WILL_FREE, I_FREEING or I_CLEAR state can
- * also cause waiting on I_NEW, without I_NEW actually
- * being set. find_inode() uses this to prevent returning
- * nearly-dead inodes.
- * I_WILL_FREE Must be set when calling write_inode_now() if i_count
- * is zero. I_FREEING must be set when I_WILL_FREE is
- * cleared.
+ * Inodes in I_FREEING or I_CLEAR state can also cause
+ * waiting on I_NEW, without I_NEW actually being set.
+ * find_inode() uses this to prevent returning nearly-dead
+ * inodes.
* I_FREEING Set when inode is about to be freed but still has dirty
* pages or buffers attached or the inode itself is still
* dirty.
* I_CLEAR Added by clear_inode(). In this state the inode is
* clean and can be destroyed. Inode keeps I_FREEING.
*
- * Inodes that are I_WILL_FREE, I_FREEING or I_CLEAR are
- * prohibited for many purposes. iget() must wait for
- * the inode to be completely released, then create it
- * anew. Other functions will just ignore such inodes,
- * if appropriate. I_NEW is used for waiting.
+ * Inodes that are I_FREEING or I_CLEAR are prohibited for
+ * many purposes. iget() must wait for the inode to be
+ * completely released, then create it anew. Other
+ * functions will just ignore such inodes, if appropriate.
+ * I_NEW is used for waiting.
*
* I_SYNC Writeback of inode is running. The bit is set during
* data writeback, and cleared with a wakeup on the bit
@@ -743,8 +740,6 @@ is_uncached_acl(struct posix_acl *acl)
* I_LRU_ISOLATING Inode is pinned being isolated from LRU without holding
* i_count.
*
- * Q: What is the difference between I_WILL_FREE and I_FREEING?
- *
* __I_{SYNC,NEW,LRU_ISOLATING} are used to derive unique addresses to wait
* upon. There's one free address left.
*/
@@ -764,7 +759,7 @@ enum inode_state_flags_enum {
I_DIRTY_SYNC = (1U << 4),
I_DIRTY_DATASYNC = (1U << 5),
I_DIRTY_PAGES = (1U << 6),
- I_WILL_FREE = (1U << 7),
+ I_PINNING_NETFS_WB = (1U << 7),
I_FREEING = (1U << 8),
I_CLEAR = (1U << 9),
I_REFERENCED = (1U << 10),
@@ -775,7 +770,6 @@ enum inode_state_flags_enum {
I_CREATING = (1U << 15),
I_DONTCACHE = (1U << 16),
I_SYNC_QUEUED = (1U << 17),
- I_PINNING_NETFS_WB = (1U << 18)
};
#define I_DIRTY_INODE (I_DIRTY_SYNC | I_DIRTY_DATASYNC)
@@ -2664,8 +2658,8 @@ static inline int icount_read(const struct inode *inode)
*/
static inline bool inode_is_dirtytime_only(struct inode *inode)
{
- return (inode_state_read_unlocked(inode) & (I_DIRTY_TIME | I_NEW |
- I_FREEING | I_WILL_FREE)) == I_DIRTY_TIME;
+ return (inode_state_read_unlocked(inode) &
+ (I_DIRTY_TIME | I_NEW | I_FREEING)) == I_DIRTY_TIME;
}
extern void inc_nlink(struct inode *inode);
diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h
index 82f6b7f26c29..1c6700011f08 100644
--- a/include/trace/events/writeback.h
+++ b/include/trace/events/writeback.h
@@ -15,7 +15,7 @@
{I_DIRTY_DATASYNC, "I_DIRTY_DATASYNC"}, \
{I_DIRTY_PAGES, "I_DIRTY_PAGES"}, \
{I_NEW, "I_NEW"}, \
- {I_WILL_FREE, "I_WILL_FREE"}, \
+ {I_PINNING_NETFS_WB, "I_PINNING_NETFS_WB"}, \
{I_FREEING, "I_FREEING"}, \
{I_CLEAR, "I_CLEAR"}, \
{I_SYNC, "I_SYNC"}, \
@@ -27,7 +27,6 @@
{I_CREATING, "I_CREATING"}, \
{I_DONTCACHE, "I_DONTCACHE"}, \
{I_SYNC_QUEUED, "I_SYNC_QUEUED"}, \
- {I_PINNING_NETFS_WB, "I_PINNING_NETFS_WB"}, \
{I_LRU_ISOLATING, "I_LRU_ISOLATING"} \
)
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 56c30d04971f..4135c11ac939 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -1296,7 +1296,7 @@ static void hook_sb_delete(struct super_block *const sb)
* iput() for the same Landlock object. Also checks I_NEW
* because such inode cannot be tied to an object.
*/
- if (inode_state_read(inode) & (I_FREEING | I_WILL_FREE | I_NEW)) {
+ if (inode_state_read(inode) & (I_NEW | I_FREEING)) {
spin_unlock(&inode->i_lock);
continue;
}
--
2.43.0
* Re: [PATCH v2 09/10] fs: set I_FREEING instead of I_WILL_FREE in iput_final() prior to writeback
2025-09-09 9:13 ` [PATCH v2 09/10] fs: set I_FREEING instead of I_WILL_FREE in iput_final() prior to writeback Mateusz Guzik
@ 2025-09-09 18:32 ` Mateusz Guzik
0 siblings, 0 replies; 11+ messages in thread
From: Mateusz Guzik @ 2025-09-09 18:32 UTC (permalink / raw)
To: brauner
Cc: viro, jack, linux-kernel, linux-fsdevel, josef, kernel-team,
amir73il, linux-btrfs, linux-ext4, linux-xfs, ocfs2-devel
On Tue, Sep 9, 2025 at 11:14 AM Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> This is in preparation for I_WILL_FREE flag removal.
>
> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
> ---
> fs/inode.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/fs/inode.c b/fs/inode.c
> index 20f36d54348c..9c695339ec3e 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -1880,18 +1880,17 @@ static void iput_final(struct inode *inode)
> return;
> }
>
> + inode_state_add(inode, I_FREEING);
> +
> if (!drop) {
> - inode_state_add(inode, I_WILL_FREE);
> spin_unlock(&inode->i_lock);
>
> write_inode_now(inode, 1);
>
> spin_lock(&inode->i_lock);
> - inode_state_del(inode, I_WILL_FREE);
> WARN_ON(inode_state_read(inode) & I_NEW);
> }
>
> - inode_state_add(inode, I_FREEING);
> if (!list_empty(&inode->i_lru))
> inode_lru_list_del(inode);
> spin_unlock(&inode->i_lock);
> --
> 2.43.0
>
On closer look I think this is buggy. write_inode_now() assumes that
I_FREEING implies removal from the io list, but does not assert it.
So I'm going to post an updated patch which moves this write down into
evict(), after removal from the io list, and only issues the write
conditionally based on the drop parameter.
On top of that, write_inode_now() is going to make a bunch of asserts
about the inode being clean after the write if I_FREEING is set.
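The ordering concern can be illustrated with a small userspace model. This is
only a sketch, not kernel code: struct model_inode, the model_* helpers and the
on_io_list field are hypothetical stand-ins for struct inode, the i_state
accessors and io list membership. I_FREEING uses the value from the enum in
patch 10; the I_NEW value is picked for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* I_FREEING matches the enum in patch 10; I_NEW is an illustrative value. */
enum {
	I_NEW     = 1U << 0,
	I_FREEING = 1U << 8,
};

/* Hypothetical userspace stand-in for the relevant parts of struct inode. */
struct model_inode {
	bool locked;        /* models holding inode->i_lock */
	unsigned int state; /* models ->i_state */
	bool on_io_list;    /* models membership on the writeback io list */
};

static void model_lock(struct model_inode *inode)
{
	assert(!inode->locked);
	inode->locked = true;
}

static void model_unlock(struct model_inode *inode)
{
	assert(inode->locked);
	inode->locked = false;
}

/* Models a locked modifier: asserts that the lock is held. */
static void model_state_add(struct model_inode *inode, unsigned int flags)
{
	assert(inode->locked);
	inode->state |= flags;
}

/* Models the lockless read variant. */
static unsigned int model_state_read_unlocked(const struct model_inode *inode)
{
	return inode->state;
}

/*
 * The assumption described above: an inode with I_FREEING set is no
 * longer on the io list by the time it gets written out.
 */
static bool model_writeback_assumption_holds(const struct model_inode *inode)
{
	if (!(model_state_read_unlocked(inode) & I_FREEING))
		return true;
	return !inode->on_io_list;
}

/* Ordering in patch 09 as posted: set the flag, write while still listed. */
static bool model_patch09_order(struct model_inode *inode)
{
	model_lock(inode);
	model_state_add(inode, I_FREEING);
	model_unlock(inode);
	return model_writeback_assumption_holds(inode);
}

/* Proposed ordering: set the flag, leave the io list, only then write. */
static bool model_proposed_order(struct model_inode *inode)
{
	model_lock(inode);
	model_state_add(inode, I_FREEING);
	model_unlock(inode);
	inode->on_io_list = false; /* models the io list removal in evict() */
	return model_writeback_assumption_holds(inode);
}
```

In this model the first ordering reaches the write with I_FREEING set while
the inode is still on the io list, and the second does not. It says nothing
about what the real write_inode_now() does in that situation.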
--
Mateusz Guzik <mjguzik gmail.com>