linux-xfs.vger.kernel.org archive mirror
From: Brian Foster <bfoster@redhat.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 3/3] xfs: teach deferred op freezer to freeze and thaw inodes
Date: Wed, 29 Apr 2020 07:38:03 -0400
Message-ID: <20200429113803.GA33986@bfoster>
In-Reply-To: <20200428221747.GH6742@magnolia>

On Tue, Apr 28, 2020 at 03:17:47PM -0700, Darrick J. Wong wrote:
> On Mon, Apr 27, 2020 at 07:37:52AM -0400, Brian Foster wrote:
> > On Sat, Apr 25, 2020 at 12:01:37PM -0700, Christoph Hellwig wrote:
> > > On Tue, Apr 21, 2020 at 07:08:26PM -0700, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > 
> > > > Make it so that the deferred operations freezer can save inode numbers
> > > > when we freeze the dfops chain, and turn them into pointers to incore
> > > > inodes when we thaw the dfops chain to finish them.  Next, add dfops
> > > > item freeze and thaw functions to the BUI/BUD items so that they can
> > > > take advantage of this new feature.  This fixes a UAF bug in the
> > > > deferred bunmapi code because xfs_bui_recover can schedule another BUI
> > > > to continue unmapping but drops the inode pointer immediately
> > > > afterwards.
> > > 
> > > I'm only looking over this for the first time, but why can't we just keep
> > > an inode reference around during recovery instead of this fairly
> > > complicated scheme to save the ino and then look it up again?
> > > 
> > 
> > I'm also a little confused about the use after free in the first place.
> > Doesn't xfs_bui_recover() look up the inode itself, or is the issue that
> > xfs_bui_recover() is fine but we might get into
> > xfs_bmap_update_finish_item() sometime later on the same inode without
> > any reference?
> 
> The second.  In practice it doesn't seem to trigger on the existing
> code, but the combination of atomic extent swap + fsstress + shutdown
> testing was enough to push it over the edge once due to reclaim.
> 
> > If the latter, similarly to Christoph I wonder if we
> > really could/should grab a reference on the inode for the intent itself,
> > even though that might not be necessary outside of recovery.
> 
> Outside of recovery we don't have the UAF problem because there's always
> something (usually the VFS dentry cache, but sometimes an explicit iget)
> that holds a reference to the inode for the duration of the transaction
> and dfops processing.
> 

Right, that's what I figured.

> One could just hang on to all incore inodes until the end of recovery
> like Christoph says, but the downside of doing it that way is that now
> we require enough memory to maintain all that incore state vs. only
> needing enough for the incore inodes involved in a particular dfops
> chain.  That isn't a huge deal now, but I was looking ahead to atomic
> extent swaps.
> 

What I was thinking above was tying the reference to the lifetime of the
intents associated with the inode, not necessarily the full lifetime of
recovery. It's not immediately clear to me if that indirectly leads to a
similar chain of in-core inodes due to unusual ordering of dfops chains
during recovery; ISTM that would mean a deviation from the typical
runtime dfops ordering, but perhaps I'm missing something...

That aside, based on your description above it seems we currently rely
on this icache retention behavior for recovery anyways, otherwise we'd
hit this use after free and probably have user reports. That suggests to
me that holding a reference is a logical next step, at least as a bug
fix patch to provide a more practical solution for stable/distro
kernels. For example, if we just associated an iget()/iput() with the
assignment of the xfs_bmap_intent->bi_owner field (and the eventual free
of the intent structure), would that technically solve the inode use
after free problem?
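
The reference-per-intent idea can be modelled in userspace. This is only a sketch under stated assumptions: `toy_inode`, `toy_bmap_intent`, `toy_iget()`, `toy_iput()`, `toy_intent_alloc()` and `toy_intent_free()` are hypothetical stand-ins, not kernel APIs, and the refcount is a plain integer rather than the real inode reference machinery. It shows the intent taking its own reference when `bi_owner` is assigned and dropping it when the intent is freed, so the inode outlives whoever created the intent.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for an in-core inode; NOT the kernel's struct xfs_inode. */
struct toy_inode {
	unsigned long long	ino;
	int			refcount;
};

/* Toy stand-in for xfs_bmap_intent; only the owner field is modelled. */
struct toy_bmap_intent {
	struct toy_inode	*bi_owner;
};

/* Hypothetical helper: take one more reference on an inode we can see. */
struct toy_inode *toy_iget(struct toy_inode *ip)
{
	ip->refcount++;
	return ip;
}

/* Hypothetical helper: drop one reference; must never underflow. */
void toy_iput(struct toy_inode *ip)
{
	assert(ip->refcount > 0);
	ip->refcount--;
}

/* Assigning bi_owner takes a reference owned by the intent itself. */
struct toy_bmap_intent *toy_intent_alloc(struct toy_inode *ip)
{
	struct toy_bmap_intent *bi = malloc(sizeof(*bi));

	if (!bi)
		return NULL;
	bi->bi_owner = toy_iget(ip);
	return bi;
}

/* Freeing the intent drops the reference taken at allocation time. */
void toy_intent_free(struct toy_bmap_intent *bi)
{
	toy_iput(bi->bi_owner);
	free(bi);
}
```

In this model the caller that looked up the inode can drop its own reference right after queueing the intent, and `bi_owner` stays valid until `toy_intent_free()` runs.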

BTW, I also wonder about the viability of changing ->bi_owner to an
xfs_ino_t instead of a direct pointer, but that might be more
involved than just adding a reference to the existing scheme...
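
For comparison, the inode-number alternative can be sketched the same way. Everything here is an assumption made for illustration: `sketch_icache`, `sketch_icache_get()`, `sketch_icache_put()` and `sketch_intent_finish()` are hypothetical names, and the fixed-size array only stands in for an inode cache lookup. The intent records a number instead of a pointer, and a reference is held only while the intent is being finished.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long long sketch_ino_t;

/* Toy in-core inode; NOT the kernel's struct xfs_inode. */
struct sketch_inode {
	sketch_ino_t	ino;
	int		refcount;
};

/* Toy inode cache: a small array searched linearly. */
#define SKETCH_CACHE_SIZE	4
struct sketch_icache {
	struct sketch_inode	slots[SKETCH_CACHE_SIZE];
	size_t			used;
};

/* The intent stores the owner's inode number, not a pointer. */
struct sketch_bmap_intent {
	sketch_ino_t	bi_owner_ino;
};

/* Look up an inode by number and return it with a reference held. */
struct sketch_inode *sketch_icache_get(struct sketch_icache *c,
				       sketch_ino_t ino)
{
	for (size_t i = 0; i < c->used; i++) {
		if (c->slots[i].ino == ino) {
			c->slots[i].refcount++;
			return &c->slots[i];
		}
	}
	return NULL;	/* this toy cache never loads missing inodes */
}

/* Drop a reference obtained from sketch_icache_get(). */
void sketch_icache_put(struct sketch_inode *ip)
{
	assert(ip->refcount > 0);
	ip->refcount--;
}

/* Finish an intent: look up the owner, do the work, drop the reference. */
int sketch_intent_finish(struct sketch_icache *c,
			 const struct sketch_bmap_intent *bi)
{
	struct sketch_inode *ip = sketch_icache_get(c, bi->bi_owner_ino);

	if (!ip)
		return -1;
	/* ... the actual update against ip would happen here ... */
	sketch_icache_put(ip);
	return 0;
}
```

The trade-off in this model is one lookup per finish step in exchange for not pinning the inode between steps.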

Brian

> (And, yeah, I should put that series on the list now...)
> 
> > Either way, more details about the problem being fixed in the commit log
> > would be helpful.
> 
> <nod>
> 
> --D
> 
> > Brian
> > 
> 



Thread overview: 18+ messages
2020-04-22  2:08 [PATCH 0/3] xfs: fix inode use-after-free during log recovery Darrick J. Wong
2020-04-22  2:08 ` [PATCH 1/3] xfs: proper replay of deferred ops queued " Darrick J. Wong
2020-04-24 14:02   ` Brian Foster
2020-04-28 22:28     ` Darrick J. Wong
2020-04-22  2:08 ` [PATCH 2/3] xfs: reduce log recovery transaction block reservations Darrick J. Wong
2020-04-24 14:04   ` Brian Foster
2020-04-28 22:22     ` Darrick J. Wong
2020-05-27 22:39       ` Darrick J. Wong
2020-04-22  2:08 ` [PATCH 3/3] xfs: teach deferred op freezer to freeze and thaw inodes Darrick J. Wong
2020-04-25 19:01   ` Christoph Hellwig
2020-04-27 11:37     ` Brian Foster
2020-04-28 22:17       ` Darrick J. Wong
2020-04-29 11:38         ` Brian Foster [this message]
2020-04-29 11:48           ` Christoph Hellwig
2020-04-29 14:28             ` Darrick J. Wong
2020-04-29 14:55               ` Christoph Hellwig
2020-04-29 23:58                 ` Darrick J. Wong
2020-05-01 17:09                   ` Christoph Hellwig
