public inbox for linux-fsdevel@vger.kernel.org
From: "Darrick J. Wong" <djwong@kernel.org>
To: cem@kernel.org
Cc: hch@infradead.org, brauner@kernel.org, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 2/2] fserror: fix lockdep complaint when igrabbing inode
Date: Wed, 18 Feb 2026 22:15:46 -0800
Message-ID: <20260219061546.GP6467@frogsfrogsfrogs>
In-Reply-To: <177148129564.716249.3069780698231701540.stgit@frogsfrogsfrogs>

On Wed, Feb 18, 2026 at 10:09:37PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Christoph Hellwig reported a lockdep splat in generic/108:
> 
>  ================================
>  WARNING: inconsistent lock state
>  6.19.0+ #4827 Tainted: G                 N
>  --------------------------------
>  inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
>  swapper/1/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
>  ffff88811ed1b140 (&sb->s_type->i_lock_key#33){?.+.}-{3:3}, at: igrab+0x1a/0xb0
>  {HARDIRQ-ON-W} state was registered at:
>    lock_acquire+0xca/0x2c0
>    _raw_spin_lock+0x2e/0x40
>    unlock_new_inode+0x2c/0xc0
>    xfs_iget+0xcf4/0x1080
>    xfs_trans_metafile_iget+0x3d/0x100
>    xfs_metafile_iget+0x2b/0x50
>    xfs_mount_setup_metadir+0x20/0x60
>    xfs_mountfs+0x457/0xa60
>    xfs_fs_fill_super+0x6b3/0xa90
>    get_tree_bdev_flags+0x13c/0x1e0
>    vfs_get_tree+0x27/0xe0
>    vfs_cmd_create+0x54/0xe0
>    __do_sys_fsconfig+0x309/0x620
>    do_syscall_64+0x8b/0xf80
>    entry_SYSCALL_64_after_hwframe+0x76/0x7e
>  irq event stamp: 139080
>  hardirqs last  enabled at (139079): [<ffffffff813a923c>] do_idle+0x1ec/0x270
>  hardirqs last disabled at (139080): [<ffffffff828a8d09>] common_interrupt+0x19/0xe0
>  softirqs last  enabled at (139032): [<ffffffff8134a853>] __irq_exit_rcu+0xc3/0x120
>  softirqs last disabled at (139025): [<ffffffff8134a853>] __irq_exit_rcu+0xc3/0x120
> 
>  other info that might help us debug this:
>   Possible unsafe locking scenario:
> 
>         CPU0
>         ----
>    lock(&sb->s_type->i_lock_key#33);
>    <Interrupt>
>      lock(&sb->s_type->i_lock_key#33);
> 
>   *** DEADLOCK ***
> 
>  1 lock held by swapper/1/0:
>   #0: ffff8881052c81a0 (&vblk->vqs[i].lock){-.-.}-{3:3}, at: virtblk_done+0x4b/0x110
> 
>  stack backtrace:
>  CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Tainted: G                 N  6.19.0+ #4827 PREEMPT(full)
>  Tainted: [N]=TEST
>  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
>  Call Trace:
>   <IRQ>
>   dump_stack_lvl+0x5b/0x80
>   print_usage_bug.part.0+0x22c/0x2c0
>   mark_lock+0xa6f/0xe90
>   __lock_acquire+0x10b6/0x25e0
>   lock_acquire+0xca/0x2c0
>   _raw_spin_lock+0x2e/0x40
>   igrab+0x1a/0xb0
>   fserror_report+0x135/0x260
>   iomap_finish_ioend_buffered+0x170/0x210
>   clone_endio+0x8f/0x1c0
>   blk_update_request+0x1e4/0x4d0
>   blk_mq_end_request+0x1b/0x100
>   virtblk_done+0x6f/0x110
>   vring_interrupt+0x59/0x80
>   __handle_irq_event_percpu+0x8a/0x2e0
>   handle_irq_event+0x33/0x70
>   handle_edge_irq+0xdd/0x1e0
>   __common_interrupt+0x6f/0x180
>   common_interrupt+0xb7/0xe0
>   </IRQ>
> 
> It looks like the concern here is that inode::i_lock is sometimes taken
> in IRQ context, and sometimes it is held when going to IRQ context,
> though it's a little difficult to tell since I think this is a kernel
> from after the actual 6.19 release but before 7.0-rc1.
> 
> Either way, we don't need to take i_lock, because filesystems should
> not report files to fserror if they're about to be freed or have not
> yet been exposed to other threads, because the resulting fsnotify report
> will be meaningless.
> 
> Therefore, bump inode::i_count directly and clarify the preconditions on
> the inode being passed in.

...and now I realize that I got so hung up on email cc list composition
that I neglected to notice that I forgot to update the commit message
to say:

"Therefore, add the ioend to a queue and get an async worker to chug
through the error events from process context with no filesystem locks
already held."
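
As a rough illustration of that queue-and-worker pattern, here is a minimal
userspace C sketch. It is not the kernel code: struct failed_item,
defer_failure(), drain_failures(), worker_kicked and the toy atomic_flag
spinlock are all hypothetical stand-ins for the ioend, the failed ioend list,
its spinlock and schedule_work(). The only point is the shape: the completion
side just queues the item, and the worker steals the whole list under the lock
and processes it with the lock dropped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a failed ioend; NOT the kernel's struct iomap_ioend. */
struct failed_item {
	int			error;
	struct failed_item	*next;
};

/* Toy spinlock built on C11 atomic_flag; models the lock guarding the list. */
static atomic_flag queue_lock = ATOMIC_FLAG_INIT;
static struct failed_item *queue_head;
static struct failed_item **queue_tail = &queue_head;
static bool worker_kicked;	/* models "the worker has been scheduled" */
static int nr_processed;	/* counts items handled by the worker */

static void lock(void)   { while (atomic_flag_test_and_set(&queue_lock)) ; }
static void unlock(void) { atomic_flag_clear(&queue_lock); }

/* "Interrupt" side: only queue the item, never do the heavy work here. */
static void defer_failure(struct failed_item *item)
{
	assert(item != NULL);
	item->next = NULL;

	lock();
	if (queue_head == NULL)
		worker_kicked = true;	/* first entry: wake the worker */
	*queue_tail = item;
	queue_tail = &item->next;
	unlock();
}

/* "Worker" side: take the whole list under the lock, process it unlocked. */
static void drain_failures(void)
{
	struct failed_item *batch;

	lock();
	batch = queue_head;
	queue_head = NULL;
	queue_tail = &queue_head;
	worker_kicked = false;
	unlock();

	while (batch) {
		struct failed_item *item = batch;

		batch = item->next;
		item->next = NULL;
		nr_processed++;	/* placeholder for the real completion work */
	}
}
```

In this sketch a caller would invoke defer_failure() from the completion path
and drain_failures() later from a context where it is safe to take other
locks; nothing is processed until the drain runs.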

Let's hope I got the paperwork right this time; all this friction to
amend minor mistakes is why I don't want to be here anymore. <grumble>

--D

> Link: https://lore.kernel.org/linux-fsdevel/aY7BndIgQg3ci_6s@infradead.org/
> Reported-by: Christoph Hellwig <hch@infradead.org>
> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
> ---
>  fs/iomap/ioend.c |   46 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 46 insertions(+)
> 
> 
> diff --git a/fs/iomap/ioend.c b/fs/iomap/ioend.c
> index e4d57cb969f1bb..4d1ef8a2cee90b 100644
> --- a/fs/iomap/ioend.c
> +++ b/fs/iomap/ioend.c
> @@ -69,11 +69,57 @@ static u32 iomap_finish_ioend_buffered(struct iomap_ioend *ioend)
>  	return folio_count;
>  }
>  
> +static DEFINE_SPINLOCK(failed_ioend_lock);
> +static LIST_HEAD(failed_ioend_list);
> +
> +static void
> +iomap_fail_ioends(
> +	struct work_struct	*work)
> +{
> +	struct iomap_ioend	*ioend;
> +	struct list_head	tmp;
> +	unsigned long		flags;
> +
> +	spin_lock_irqsave(&failed_ioend_lock, flags);
> +	list_replace_init(&failed_ioend_list, &tmp);
> +	spin_unlock_irqrestore(&failed_ioend_lock, flags);
> +
> +	while ((ioend = list_first_entry_or_null(&tmp, struct iomap_ioend,
> +			io_list))) {
> +		list_del_init(&ioend->io_list);
> +		iomap_finish_ioend_buffered(ioend);
> +		cond_resched();
> +	}
> +}
> +
> +static DECLARE_WORK(failed_ioend_work, iomap_fail_ioends);
> +
> +static void iomap_fail_ioend_buffered(struct iomap_ioend *ioend)
> +{
> +	unsigned long flags;
> +
> +	/*
> +	 * Bounce I/O errors to a workqueue to avoid nested i_lock acquisitions
> +	 * in the fserror code.  The caller no longer owns the ioend reference
> +	 * after the spinlock drops.
> +	 */
> +	spin_lock_irqsave(&failed_ioend_lock, flags);
> +	if (list_empty(&failed_ioend_list))
> +		WARN_ON_ONCE(!schedule_work(&failed_ioend_work));
> +	list_add_tail(&ioend->io_list, &failed_ioend_list);
> +	spin_unlock_irqrestore(&failed_ioend_lock, flags);
> +}
> +
>  static void ioend_writeback_end_bio(struct bio *bio)
>  {
>  	struct iomap_ioend *ioend = iomap_ioend_from_bio(bio);
>  
>  	ioend->io_error = blk_status_to_errno(bio->bi_status);
> +	if (ioend->io_error) {
> +		iomap_fail_ioend_buffered(ioend);
> +		return;
> +	}
> +
>  	iomap_finish_ioend_buffered(ioend);
>  }
>  
> 
> 

Thread overview: 11+ messages
2026-02-19  6:09 [PATCHSET] fs: bug fixes for 7.0 Darrick J. Wong
2026-02-19  6:09 ` [PATCH 1/2] fsnotify: drop unused helper Darrick J. Wong
2026-02-19  6:53   ` Christoph Hellwig
2026-02-19 11:37   ` Jan Kara
2026-02-19  6:09 ` [PATCH 2/2] fserror: fix lockdep complaint when igrabbing inode Darrick J. Wong
2026-02-19  6:15   ` Darrick J. Wong [this message]
2026-02-19  8:11     ` Christian Brauner
2026-02-20  0:55       ` Darrick J. Wong
2026-02-19  6:57   ` Christoph Hellwig
2026-02-19 22:52     ` Darrick J. Wong
  -- strict thread matches above, loose matches on Subject: below --
2026-02-20  1:00 [PATCHSET v2 2/2] fs: bug fixes for 7.0 Darrick J. Wong
2026-02-20  1:02 ` [PATCH 2/2] fserror: fix lockdep complaint when igrabbing inode Darrick J. Wong
