From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org, chandan.babu@oracle.com
Subject: Re: [PATCH] xfs: don't use current->journal_info
Date: Fri, 23 Feb 2024 08:28:43 -0800 [thread overview]
Message-ID: <20240223162843.GL616564@frogsfrogsfrogs> (raw)
In-Reply-To: <ZdafCN+mNecltZ1T@dread.disaster.area>
On Thu, Feb 22, 2024 at 12:10:32PM +1100, Dave Chinner wrote:
> On Wed, Feb 21, 2024 at 03:25:36PM -0800, Darrick J. Wong wrote:
> > On Thu, Feb 22, 2024 at 09:47:23AM +1100, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@redhat.com>
> > >
> > > syzbot reported an ext4 panic during a page fault where found a
> > > journal handle when it didn't expect to find one. The structure
> > > it tripped over had a value of 'TRAN' in the first entry in the
> > > structure, and that indicates it tripped over a struct xfs_trans
> > > instead of a jbd2 handle.
> > >
> > > The reason for this is that the page fault was taken during a
> > > copy-out to a user buffer from an xfs bulkstat operation. XFS uses
> > > an "empty" transaction context for bulkstat to do automated metadata
> > > buffer cleanup, and so the transaction context is valid across the
> > > copyout of the bulkstat info into the user buffer.
> > >
> > > We are using empty transaction contexts like this in XFS in more
> > > places to reduce the risk of failing to release objects we reference
> > > during the operation, especially during error handling. Hence we
> > > really need to ensure that we can take page faults from these
> > > contexts without leaving landmines for the code processing the page
> > > fault to trip over.
> > >
> > > We really only use current->journal_info for a single warning check
> > > in xfs_vm_writepages() to ensure we aren't doing writeback from a
> > > transaction context. Writeback might need to do allocation, so it
> > > can need to run transactions itself. Hence it's a debug check to
> > > warn us that we've done something silly, and largely it is not all
> > > that useful.
> > >
> > > So let's just remove all the use of current->journal_info in XFS and
> > > get rid of all the potential issues from nested contexts where
> > > current->journal_info might get misused by another filesytsem
> > > context.
> >
> > I wonder if this is too much, though?
>
> We ran XFS for 15+ years without setting current->journal_info, so I
> don't see it as a necessary feature...
Indeed, I don't see it as a necessary feature for XFS, merely a "bs
coming in from other parts of the kernel" detector. And boy howdy did I
ever find bs.
> > Conceptually, I'd rather we set current->journal_info to some random
> > number whenever we start a !NO_WRITECOUNT (aka a non-empty) transaction
> > and clear it during transaction commit/cancel. That way, *we* can catch
> > the case where some other filesystem starts a transaction and
> > accidentally bounces into an updating write fault inside XFS.
>
> I could just leave the ASSERT(current->journal_info == NULL); in
> xfs_trans_set_context() and we would catch that case. But, really,
> having a page fault from some other filesystem bounce into XFS where
> we then have to run a transaction isn't a bug at all.
>
> The problem is purely that we now have two different contexts that
> now think they own current->journal_info. IOWs, no filesystem can
> allow page faults while current->journal_info is set by the
> filesystem because the page fault processing might use
> current->journal_info itself.
>
> If we end up with nested XFS transactions from a page fault whilst
> holding an empty transaction, then it isn't an issue as the outer
> transaction does not hold a log reservation. The only problem that
> might occur is a deadlock if the page fault tries to take the same
> locks the upper context holds, but that's not a problem that setting
> current->journal_info would solve, anyway...
This, however, is a much better justification (IMHO) for removing the
places where xfs touches journal_info. Would you mind adding that to
the commit message?
> Hence if XFS doesn't use current->journal_info, then we just don't
> care what context we are running the transaction in, above or below
> the fault.
>
> > That might be outweighed by the weird semantics of ext4 where the
> > zeroness of journal_info changes its behavior in ways I don't want to
> > understand. Thoughts?
>
> That's exactly the problem we're trying to avoid. Either we don't
> allow faults in (empty) transaction context, or we don't use
> current->journal_info. I prefer the latter as it gives us much more
> implementation freedom with empty transaction contexts..
Heh. I also found much much more bs and shenanigans of exactly the type
you'd expect from a global(ish) void pointer.
With the commit message amended, you can add:
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
--D
> -Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
next prev parent reply other threads:[~2024-02-23 16:28 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-21 22:47 [PATCH] xfs: don't use current->journal_info Dave Chinner
2024-02-21 23:25 ` Darrick J. Wong
2024-02-22 1:10 ` Dave Chinner
2024-02-23 16:28 ` Darrick J. Wong [this message]
2024-02-22 15:03 ` [External] : " mark.tinguely
2024-02-23 6:51 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240223162843.GL616564@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=chandan.babu@oracle.com \
--cc=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox