From: Dave Chinner <dgc@kernel.org>
To: Hans Holmberg <hans.holmberg@wdc.com>
Cc: Carlos Maiolino <cem@kernel.org>,
"Darrick J . Wong" <djwong@kernel.org>,
Dave Chinner <david@fromorbit.com>,
Christoph Hellwig <hch@lst.de>,
Damien Le Moal <dlemoal@kernel.org>,
linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: return -ENOENT for unallocated inodes in xfs_imap_lookup
Date: Wed, 13 May 2026 23:43:51 +1000
Message-ID: <agSAF-9YFa1ODMIG@dread>
In-Reply-To: <20260513063745.8067-1-hans.holmberg@wdc.com>
On Wed, May 13, 2026 at 08:37:45AM +0200, Hans Holmberg wrote:
> Under heavy garbage collection pressure from RocksDB workloads,
> filesystem shutdowns can occur in xfs_zone_gc_iter_irec when
> xfs_iget() returns -EINVAL.
>
> xfs_zone_gc_iter_irec expects -ENOENT when garbage collection races
> with file deletion,
xfs_iget() returns -ENOENT when a lookup races with an unlink on the
cache hit side. When the inode is not in cache, it cannot race with
file deletion.

If we miss the cache, xfs_iget() reads the inode from disk and then
checks its state before inserting it into the cache. If the inode
mode is zero on disk, then we raced with unlink and -ENOENT will be
returned.

IOWs, a plain xfs_iget() call will handle races with unlink...
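
As a rough user-space sketch of that flow (not the kernel code:
toy_iget(), struct toy_fs and everything else below are made-up names,
and the "cache" and "disk" are just arrays), the hit and miss paths
both end up at -ENOENT when the inode has gone away:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_INODES	4

/* Hypothetical stand-in for an on-disk inode; mode == 0 means "free". */
struct toy_dinode {
	unsigned int	mode;
};

/* Hypothetical stand-in for an in-core inode in the toy cache. */
struct toy_inode {
	bool		cached;		/* present in the toy cache */
	bool		unlinked;	/* being torn down after unlink */
	unsigned int	mode;
};

struct toy_fs {
	struct toy_inode	cache[TOY_NR_INODES];
	struct toy_dinode	disk[TOY_NR_INODES];
};

/*
 * Toy lookup: returns 0 on success, -ENOENT if the lookup raced with
 * unlink. Assumes a trusted inode number, i.e. ino < TOY_NR_INODES.
 */
static int toy_iget(struct toy_fs *fs, unsigned int ino)
{
	struct toy_inode *ip;

	assert(fs != NULL && ino < TOY_NR_INODES);
	ip = &fs->cache[ino];

	if (ip->cached) {
		/* Cache hit: an inode being freed means we lost the race. */
		if (ip->unlinked)
			return -ENOENT;
		return 0;
	}

	/* Cache miss: "read" the inode from disk and check its state. */
	if (fs->disk[ino].mode == 0)
		return -ENOENT;

	/* Only a live inode gets inserted into the cache. */
	ip->mode = fs->disk[ino].mode;
	ip->unlinked = false;
	ip->cached = true;
	return 0;
}
```

In this toy model a freed inode is never inserted into the cache, so
the caller only ever sees 0 or -ENOENT.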
> so that blocks belonging to deleted files can be
> skipped gracefully. Returning -EINVAL instead causes the GC code to
> treat this as a fatal error and forces a shutdown.
If it passes in XFS_IGET_UNTRUSTED to xfs_iget(), then it is saying
the inode number comes from an unknown source and that it may be
invalid. Hence we do more rigorous and costly checks on the inode
number (like forcing a btree lookup) if it is not already validated
and in cache. If any of these validation checks fail we return
-EINVAL to indicate it was an invalid inode number.

IOWs, xfs_iget(XFS_IGET_UNTRUSTED) callers need to handle -EINVAL,
because validity checking the inode number is what the caller -asked
it to do-.

If you are using an inode number from a trusted source (i.e. internal
filesystem metadata like a directory data block or rmapbt) then you
don't need XFS_IGET_UNTRUSTED. The inode number is known to be good,
modulo races with unlink which xfs_iget() handles anyway...
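
To make the difference concrete, here is a small user-space model
(again a sketch under assumptions, not the kernel implementation:
demo_iget(), DEMO_IGET_UNTRUSTED, demo_visit() and the allocated[]
array standing in for the btree lookup are all hypothetical). It shows
the same freed inode producing -ENOENT for a trusted lookup and
-EINVAL for an untrusted one:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_IGET_UNTRUSTED	(1u << 0)	/* hypothetical flag */
#define DEMO_NR_INODES		4

/* Hypothetical toy filesystem: one slot per inode number. */
struct demo_fs {
	bool		allocated[DEMO_NR_INODES]; /* stand-in for the btree */
	unsigned int	mode[DEMO_NR_INODES];	   /* on-disk mode, 0 == free */
};

static int demo_iget(const struct demo_fs *fs, unsigned int ino,
		     unsigned int flags)
{
	assert(fs != NULL);

	if (flags & DEMO_IGET_UNTRUSTED) {
		/* Caller asked for validation: bad numbers are -EINVAL. */
		if (ino >= DEMO_NR_INODES || !fs->allocated[ino])
			return -EINVAL;
	} else {
		/* Trusted callers promise the number is in range. */
		assert(ino < DEMO_NR_INODES);
	}

	/* A mode of zero means the inode was freed: raced with unlink. */
	if (fs->mode[ino] == 0)
		return -ENOENT;
	return 0;
}

enum demo_action { DEMO_PROCESS, DEMO_SKIP, DEMO_FATAL };

/* Hypothetical caller that only tolerates -ENOENT. */
static enum demo_action demo_visit(const struct demo_fs *fs,
				   unsigned int ino, unsigned int flags)
{
	int error = demo_iget(fs, ino, flags);

	if (error == 0)
		return DEMO_PROCESS;
	if (error == -ENOENT)
		return DEMO_SKIP;	/* inode went away, nothing to do */
	return DEMO_FATAL;		/* e.g. -EINVAL from validation */
}
```

With a freed inode, demo_visit(fs, ino, 0) skips it, while
demo_visit(fs, ino, DEMO_IGET_UNTRUSTED) hits the fatal path unless the
caller also learns to handle -EINVAL.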
-Dave.
--
Dave Chinner
dgc@kernel.org