From: "Darrick J. Wong" <djwong@kernel.org>
To: cem@kernel.org
Cc: hch@infradead.org, brauner@kernel.org, linux-xfs@vger.kernel.org,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 2/2] fserror: fix lockdep complaint when igrabbing inode
Date: Wed, 18 Feb 2026 22:15:46 -0800 [thread overview]
Message-ID: <20260219061546.GP6467@frogsfrogsfrogs> (raw)
In-Reply-To: <177148129564.716249.3069780698231701540.stgit@frogsfrogsfrogs>
On Wed, Feb 18, 2026 at 10:09:37PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Christoph Hellwig reported a lockdep splat in generic/108:
>
> ================================
> WARNING: inconsistent lock state
> 6.19.0+ #4827 Tainted: G N
> --------------------------------
> inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
> swapper/1/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
> ffff88811ed1b140 (&sb->s_type->i_lock_key#33){?.+.}-{3:3}, at: igrab+0x1a/0xb0
> {HARDIRQ-ON-W} state was registered at:
> lock_acquire+0xca/0x2c0
> _raw_spin_lock+0x2e/0x40
> unlock_new_inode+0x2c/0xc0
> xfs_iget+0xcf4/0x1080
> xfs_trans_metafile_iget+0x3d/0x100
> xfs_metafile_iget+0x2b/0x50
> xfs_mount_setup_metadir+0x20/0x60
> xfs_mountfs+0x457/0xa60
> xfs_fs_fill_super+0x6b3/0xa90
> get_tree_bdev_flags+0x13c/0x1e0
> vfs_get_tree+0x27/0xe0
> vfs_cmd_create+0x54/0xe0
> __do_sys_fsconfig+0x309/0x620
> do_syscall_64+0x8b/0xf80
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
> irq event stamp: 139080
> hardirqs last enabled at (139079): [<ffffffff813a923c>] do_idle+0x1ec/0x270
> hardirqs last disabled at (139080): [<ffffffff828a8d09>] common_interrupt+0x19/0xe0
> softirqs last enabled at (139032): [<ffffffff8134a853>] __irq_exit_rcu+0xc3/0x120
> softirqs last disabled at (139025): [<ffffffff8134a853>] __irq_exit_rcu+0xc3/0x120
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(&sb->s_type->i_lock_key#33);
> <Interrupt>
> lock(&sb->s_type->i_lock_key#33);
>
> *** DEADLOCK ***
>
> 1 lock held by swapper/1/0:
> #0: ffff8881052c81a0 (&vblk->vqs[i].lock){-.-.}-{3:3}, at: virtblk_done+0x4b/0x110
>
> stack backtrace:
> CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Tainted: G N 6.19.0+ #4827 PREEMPT(full)
> Tainted: [N]=TEST
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
> Call Trace:
> <IRQ>
> dump_stack_lvl+0x5b/0x80
> print_usage_bug.part.0+0x22c/0x2c0
> mark_lock+0xa6f/0xe90
> __lock_acquire+0x10b6/0x25e0
> lock_acquire+0xca/0x2c0
> _raw_spin_lock+0x2e/0x40
> igrab+0x1a/0xb0
> fserror_report+0x135/0x260
> iomap_finish_ioend_buffered+0x170/0x210
> clone_endio+0x8f/0x1c0
> blk_update_request+0x1e4/0x4d0
> blk_mq_end_request+0x1b/0x100
> virtblk_done+0x6f/0x110
> vring_interrupt+0x59/0x80
> __handle_irq_event_percpu+0x8a/0x2e0
> handle_irq_event+0x33/0x70
> handle_edge_irq+0xdd/0x1e0
> __common_interrupt+0x6f/0x180
> common_interrupt+0xb7/0xe0
> </IRQ>
>
> It looks like the concern here is that inode::i_lock is sometimes taken
> in IRQ context, and sometimes it is held when going to IRQ context,
> though it's a little difficult to tell since I think this is a kernel
> from after the actual 6.19 release but before 7.0-rc1.
>
> Either way, we don't need to take i_lock, because filesystems should
> not report files to fserror if they're about to be freed or have not
> yet been exposed to other threads, because the resulting fsnotify report
> will be meaningless.
>
> Therefore, bump inode::i_count directly and clarify the preconditions on
> the inode being passed in.
...and now I realize that I got so hung up on email cc list composition
that I neglected to notice that I forgot to update the commit message
to say:
"Therefore, add the ioend to a queue and get an async worker to chug
through the error events from process context with no filesystem locks
already held."
Let's hope I got the paperwork right this time, all this friction to
amend minor mistakes are why I don't want to be here anymore. <grumble>
--D
> Link: https://lore.kernel.org/linux-fsdevel/aY7BndIgQg3ci_6s@infradead.org/
> Reported-by: Christoph Hellwig <hch@infradead.org>
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
> fs/iomap/ioend.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 46 insertions(+)
>
>
> diff --git a/fs/iomap/ioend.c b/fs/iomap/ioend.c
> index e4d57cb969f1bb..4d1ef8a2cee90b 100644
> --- a/fs/iomap/ioend.c
> +++ b/fs/iomap/ioend.c
> @@ -69,11 +69,57 @@ static u32 iomap_finish_ioend_buffered(struct iomap_ioend *ioend)
> return folio_count;
> }
>
> +static DEFINE_SPINLOCK(failed_ioend_lock);
> +static LIST_HEAD(failed_ioend_list);
> +
> +static void
> +iomap_fail_ioends(
> + struct work_struct *work)
> +{
> + struct iomap_ioend *ioend;
> + struct list_head tmp;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&failed_ioend_lock, flags);
> + list_replace_init(&failed_ioend_list, &tmp);
> + spin_unlock_irqrestore(&failed_ioend_lock, flags);
> +
> + while ((ioend = list_first_entry_or_null(&tmp, struct iomap_ioend,
> + io_list))) {
> + list_del_init(&ioend->io_list);
> + iomap_finish_ioend_buffered(ioend);
> + cond_resched();
> + }
> +}
> +
> +static DECLARE_WORK(failed_ioend_work, iomap_fail_ioends);
> +
> +static void iomap_fail_ioend_buffered(struct iomap_ioend *ioend)
> +{
> + unsigned long flags;
> +
> + /*
> + * Bounce I/O errors to a workqueue to avoid nested i_lock acquisitions
> + * in the fserror code. The caller no longer owns the ioend reference
> + * after the spinlock drops.
> + */
> + spin_lock_irqsave(&failed_ioend_lock, flags);
> + if (list_empty(&failed_ioend_list))
> + WARN_ON_ONCE(!schedule_work(&failed_ioend_work));
> + list_add_tail(&ioend->io_list, &failed_ioend_list);
> + spin_unlock_irqrestore(&failed_ioend_lock, flags);
> +}
> +
> static void ioend_writeback_end_bio(struct bio *bio)
> {
> struct iomap_ioend *ioend = iomap_ioend_from_bio(bio);
>
> ioend->io_error = blk_status_to_errno(bio->bi_status);
> + if (ioend->io_error) {
> + iomap_fail_ioend_buffered(ioend);
> + return;
> + }
> +
> iomap_finish_ioend_buffered(ioend);
> }
>
>
>
next prev parent reply other threads:[~2026-02-19 6:15 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-19 6:09 [PATCHSET] fs: bug fixes for 7.0 Darrick J. Wong
2026-02-19 6:09 ` [PATCH 1/2] fsnotify: drop unused helper Darrick J. Wong
2026-02-19 6:53 ` Christoph Hellwig
2026-02-19 11:37 ` Jan Kara
2026-02-19 6:09 ` [PATCH 2/2] fserror: fix lockdep complaint when igrabbing inode Darrick J. Wong
2026-02-19 6:15 ` Darrick J. Wong [this message]
2026-02-19 8:11 ` Christian Brauner
2026-02-20 0:55 ` Darrick J. Wong
2026-02-19 6:57 ` Christoph Hellwig
2026-02-19 22:52 ` Darrick J. Wong
-- strict thread matches above, loose matches on Subject: below --
2026-02-20 1:00 [PATCHSET v2 2/2] fs: bug fixes for 7.0 Darrick J. Wong
2026-02-20 1:02 ` [PATCH 2/2] fserror: fix lockdep complaint when igrabbing inode Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260219061546.GP6467@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=brauner@kernel.org \
--cc=cem@kernel.org \
--cc=hch@infradead.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox