From: "Darrick J. Wong" <djwong@kernel.org>
To: Jan Kara <jack@suse.cz>
Cc: Amir Goldstein <amir73il@gmail.com>,
cem@kernel.org, hch@lst.de, linux-fsdevel@vger.kernel.org,
linux-xfs@vger.kernel.org, gabriel@krisman.be,
Christian Brauner <brauner@kernel.org>
Subject: Re: [PATCH 1/6] iomap: report file IO errors to fsnotify
Date: Thu, 6 Nov 2025 09:06:18 -0800 [thread overview]
Message-ID: <20251106170618.GR196362@frogsfrogsfrogs> (raw)
In-Reply-To: <flgcbd3bzaa6sdlnspub7htwyengcuuadbcws32edns7z5fpgr@kyjoo4tf7qyf>
On Thu, Nov 06, 2025 at 11:13:45AM +0100, Jan Kara wrote:
> On Wed 05-11-25 11:41:38, Darrick J. Wong wrote:
> > On Wed, Nov 05, 2025 at 10:28:08AM -0800, Darrick J. Wong wrote:
> > > On Wed, Nov 05, 2025 at 03:24:41PM +0100, Jan Kara wrote:
> > > > On Wed 05-11-25 12:14:52, Amir Goldstein wrote:
> > > > >
> > > > > ...
> > > > > We recently discovered that fsnotify_sb_error() calls are exposed to
> > > > > races with generic_shutdown_super():
> > > > > https://lore.kernel.org/linux-fsdevel/scmyycf2trich22v25s6gpe3ib6ejawflwf76znxg7sedqablp@ejfycd34xvpa/
> > >
> > > Hrmm. I've noticed that ever since I added this new patchset, I've been
> > > getting more instances of outright crashes in the timer code, or
> > > workqueue lockups. I wonder if that UAF is what's going on here...
> > >
> > > > > Will punting all FS_ERROR events to workqueue help to improve this
> > > > > situation or will it make it worse?
> > > >
> > > > Worse. But you raise a really good point which I've missed during my
> > > > review. Currently there's nothing which synchronizes pending works with
> > > > superblock getting destroyed with obvious UAF issues already in
> > > > handle_sb_error().
> > >
> > > I wonder, could __sb_error call get_active_super() to obtain an active
> > > reference to the sb, and then deactivate_super() it in the workqueue
> > > callback? If we can't get an active ref then we presume that the fs is
> > > already shutting down and don't send the event.
> >
> > ...and now that I've actually tried it, I realize that we can't actually
> > call get_active_super because it can sleep waiting for s_umount and
> > SB_BORN. Maybe we could directly atomic_inc_not_zero(&sb->s_active)
> > adn trust that the caller has an active ref to the sb? I think that's
> > true for anyone calling __sb_error with a non-null inode.
>
> Well, the side-effects of holding active sb reference from some workqueue
> item tend to hit back occasionally (like when userspace assumes the device
> isn't used anymore but it in fact still is because of the active
> reference). Every time we tried something like this (last time it was with
> iouring I believe) some user came back and complained his setup broke. In
> this case it should be really rare but still I think it's better to avoid
> it if we can (plus I'm not sure what you'd like to do for __sb_error()
> callers that don't get the inode and thus active reference isn't really
> guaranteed - they still need the protection against umount so that
> handle_sb_error() can do the notifier callchain thing).
>
> So I think a better solution might be that generic_shutdown_super() waits
> for pending error notifications after clearing SB_ACTIVE before umount
> proceeds further and __sb_error() just starts discarding new notifications
> as soon as we see SB_ACTIVE is clear.
<nod> Summarizing what we just talked about on the ext4 call--
Instead of grabbing an active reference to the sb, I'll instead fix
__sb_error to (a) ignore !BORN || !ACTIVE || DYING supers, and (b)
increment a counter in the sb whenever we queue an event, and decrement
it when the worker finishes with it. generic_shutdown_super can then
wait (having just cleared ACTIVE) for the counter to hit zero before it
stops fsnotify and drops the dentry cache.
Or at least that's what I'll try today.
--D
> Honza
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
>
next prev parent reply other threads:[~2025-11-06 17:06 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-05 0:46 [PATCHBOMB v2 6.19] xfs: autonomous self healing Darrick J. Wong
2025-11-05 0:48 ` [PATCHSET V3 1/2] xfs: autonomous self healing of filesystems Darrick J. Wong
2025-11-05 0:48 ` [PATCH 01/22] docs: remove obsolete links in the xfs online repair documentation Darrick J. Wong
2025-11-26 6:43 ` Christoph Hellwig
2025-11-27 15:09 ` Carlos Maiolino
2025-11-05 0:48 ` [PATCH 02/22] docs: discuss autonomous self healing in the xfs online repair design doc Darrick J. Wong
2025-11-05 0:49 ` [PATCH 03/22] xfs: create debugfs uuid aliases Darrick J. Wong
2025-11-05 0:49 ` [PATCH 04/22] xfs: create hooks for monitoring health updates Darrick J. Wong
2025-11-05 0:49 ` [PATCH 05/22] xfs: create a filesystem shutdown hook Darrick J. Wong
2025-11-05 0:49 ` [PATCH 06/22] xfs: create hooks for media errors Darrick J. Wong
2025-11-05 0:50 ` [PATCH 07/22] iomap: report buffered read and write io errors to the filesystem Darrick J. Wong
2025-11-05 0:50 ` [PATCH 08/22] iomap: report directio read and write errors to callers Darrick J. Wong
2025-11-05 0:50 ` [PATCH 09/22] xfs: create file io error hooks Darrick J. Wong
2025-11-05 0:51 ` [PATCH 10/22] xfs: create a special file to pass filesystem health to userspace Darrick J. Wong
2025-11-05 0:51 ` [PATCH 11/22] xfs: create event queuing, formatting, and discovery infrastructure Darrick J. Wong
2025-11-05 0:51 ` [PATCH 12/22] xfs: report metadata health events through healthmon Darrick J. Wong
2025-11-05 0:51 ` [PATCH 13/22] xfs: report shutdown " Darrick J. Wong
2025-11-05 0:52 ` [PATCH 14/22] xfs: report media errors " Darrick J. Wong
2025-11-05 0:52 ` [PATCH 15/22] xfs: report file io " Darrick J. Wong
2025-11-05 0:52 ` [PATCH 16/22] xfs: allow reconfiguration of the health monitoring device Darrick J. Wong
2025-11-05 0:52 ` [PATCH 17/22] xfs: validate fds against running healthmon Darrick J. Wong
2025-11-05 0:53 ` [PATCH 18/22] xfs: add media error reporting ioctl Darrick J. Wong
2025-11-05 0:53 ` [PATCH 19/22] xfs: send uevents when major filesystem events happen Darrick J. Wong
2025-11-05 0:53 ` [PATCH 20/22] xfs: merge health monitoring events when possible Darrick J. Wong
2025-11-05 0:53 ` [PATCH 21/22] xfs: restrict healthmon users further Darrick J. Wong
2025-11-05 0:54 ` [PATCH 22/22] xfs: charge healthmon event objects to the memcg of the listening process Darrick J. Wong
2025-11-05 0:48 ` [PATCHSET V3 2/2] iomap: generic file IO error reporting Darrick J. Wong
2025-11-05 0:54 ` [PATCH 1/6] iomap: report file IO errors to fsnotify Darrick J. Wong
2025-11-05 11:00 ` Jan Kara
2025-11-05 11:14 ` Amir Goldstein
2025-11-05 14:24 ` Jan Kara
2025-11-05 18:28 ` Darrick J. Wong
2025-11-05 19:41 ` Darrick J. Wong
2025-11-06 10:13 ` Jan Kara
2025-11-06 17:06 ` Darrick J. Wong [this message]
2025-11-05 0:54 ` [PATCH 2/6] xfs: switch healthmon to use the iomap I/O error reporting Darrick J. Wong
2025-11-05 0:54 ` [PATCH 3/6] xfs: port notify-failure to use the new vfs io " Darrick J. Wong
2025-11-05 0:55 ` [PATCH 4/6] xfs: remove file I/O error hooks Darrick J. Wong
2025-11-05 0:55 ` [PATCH 5/6] iomap: remove " Darrick J. Wong
2025-11-05 0:55 ` [PATCH 6/6] xfs: report fs metadata errors via fsnotify Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251106170618.GR196362@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=amir73il@gmail.com \
--cc=brauner@kernel.org \
--cc=cem@kernel.org \
--cc=gabriel@krisman.be \
--cc=hch@lst.de \
--cc=jack@suse.cz \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).