From: Omar Sandoval <osandov@osandov.com>
To: David Sterba <dsterba@suse.cz>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com,
Filipe Manana <fdmanana@suse.com>,
Leo Martins <loemra.dev@gmail.com>, Boris Burkov <boris@bur.io>
Subject: Re: [PATCH] btrfs: fix subvolume deletion lockup caused by inodes xarray race
Date: Thu, 28 Aug 2025 20:31:13 -0700 [thread overview]
Message-ID: <aLEfAc75VnQ5pcwu@telecaster> (raw)
In-Reply-To: <20250828232346.GD29826@twin.jikos.cz>
On Fri, Aug 29, 2025 at 01:23:46AM +0200, David Sterba wrote:
> On Tue, Aug 26, 2025 at 01:01:38PM -0700, Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@fb.com>
> >
> > There is a race condition between inode eviction and inode caching that
> > can cause a live struct btrfs_inode to be missing from the root->inodes
> > xarray. Specifically, there is a window during evict() between the inode
> > being unhashed and deleted from the xarray. If btrfs_iget() is called
> > for the same inode in that window, it will be recreated and inserted
> > into the xarray, but then eviction will delete the new entry, leaving
> > nothing in the xarray:
> >
> > Thread 1 Thread 2
> > ---------------------------------------------------------------
> > evict()
> > remove_inode_hash()
> > btrfs_iget_path()
> > btrfs_iget_locked()
> > btrfs_read_locked_inode()
> > btrfs_add_inode_to_root()
> > destroy_inode()
> > btrfs_destroy_inode()
> > btrfs_del_inode_from_root()
> > __xa_erase
> >
> > In turn, this can cause issues for subvolume deletion. Specifically, if
> > an inode is in this lost state, and all other inodes are evicted, then
> > btrfs_del_inode_from_root() will call btrfs_add_dead_root() prematurely.
> > If the lost inode has a delayed_node attached to it, then when
> > btrfs_clean_one_deleted_snapshot() calls btrfs_kill_all_delayed_nodes(),
> > it will loop forever because the delayed_nodes xarray will never become
> > empty (unless memory pressure forces the inode out). We saw this
> > manifest as soft lockups in production.
> >
> > Fix it by only deleting the xarray entry if it matches the given inode
> > (using __xa_cmpxchg()).
> >
> > Fixes: 310b2f5d5a94 ("btrfs: use an xarray to track open inodes in a root")
>
> Is this correct Fixes commit? The xarray conversion was done in two
> steps, first the rbtree to xarray and then the locking got changed in
> e2844cce75c9e6 ("btrfs: remove inode_lock from struct btrfs_root and use
> xarray locks").
Yes, this is the correct Fixes commit. inode_lock didn't synchronize
with the VFS's inode hash at all, so it didn't help with this. The
reason it was okay with the rbtree was that each struct btrfs_inode had
its own rb_node, so deleting one wouldn't affect any others with the
same ino.
> The root->inode_lock is an outer lock, and with rbtree it worked. The
> new code relies on xa_lock, which could be released by xarray internally
> or elsewhere we need the atomic operations. Like in this case it's the
> __xa_cmpxchg which I find quite unobvious regarding the xarray API and
> who knows where something like it could be still missing. The simplicity
> of root->inode_lock feels a bit safer.
The documentation of __xa_cmpxchg() says:
* Context: Any context. Expects xa_lock to be held on entry. May
* release and reacquire xa_lock if @gfp flags permit.
And GFP_ATOMIC does not permit that.
next prev parent reply other threads:[~2025-08-29 3:31 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-26 20:01 [PATCH] btrfs: fix subvolume deletion lockup caused by inodes xarray race Omar Sandoval
2025-08-26 20:25 ` Josef Bacik
2025-08-26 21:19 ` Omar Sandoval
2025-08-27 8:45 ` Filipe Manana
2025-08-28 23:23 ` David Sterba
2025-08-29 3:31 ` Omar Sandoval [this message]
2025-08-29 14:23 ` David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aLEfAc75VnQ5pcwu@telecaster \
--to=osandov@osandov.com \
--cc=boris@bur.io \
--cc=dsterba@suse.cz \
--cc=fdmanana@suse.com \
--cc=kernel-team@fb.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=loemra.dev@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.