linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Brian Foster <bfoster@redhat.com>, linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: clean up xfs_mount allocation and dynamic initializers
Date: Mon, 26 Mar 2018 09:17:44 +1100	[thread overview]
Message-ID: <20180325221744.GI18129@dastard> (raw)
In-Reply-To: <20180324005203.GM4818@magnolia>

On Fri, Mar 23, 2018 at 05:52:03PM -0700, Darrick J. Wong wrote:
> On Thu, Mar 22, 2018 at 11:19:45AM +1100, Dave Chinner wrote:
> > On Wed, Mar 21, 2018 at 02:07:38PM -0400, Brian Foster wrote:
> > > On Wed, Mar 21, 2018 at 10:27:28AM -0700, Darrick J. Wong wrote:
> > > > On Wed, Mar 21, 2018 at 12:13:25PM -0400, Brian Foster wrote:
> > > > > Most of the generic data structures embedded in xfs_mount are
> > > > > dynamically initialized immediately after mp is allocated. A few
> > > > > fields are left out and initialized during the xfs_mountfs()
> > > > > sequence, after mp has been attached to the superblock.
> > > > > 
> > > > > To clean this up and help prevent premature access of associated
> > > > > fields, refactor xfs_mount allocation and all dependent init calls
> > > > > into a new helper. This self-documents that all low level data
> > > > > structures (i.e., locks, trees, etc.) should be initialized before
> > > > > xfs_mount is attached to the superblock.
> > > > > 
> > > > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > > > ---
> > > > > 
> > > > > I realized after the fact that this steps on the ability to instrument
> > > > > the use-before-init variant[1] of the problematic radix tree access on
> > > > > failed mount issue that Dave tracked down. This is because we'd
> > > > > immediately initialize the structure and thus a subsequent memset()
> > > > > would ultimately require another proper initialization.
> > > > 
> > > > That's simply a matter of re-calling
> > > > 
> > > > 	INIT_RADIX_TREE(&mp->m_perag_tree, GFP_ATOMIC);
> > > > 
> > > > after the msleep, right?
> > > > 
> > > 
> > > I suppose that may work (I'll let Dave chime in)...
> > 
> > Well, to tell the truth, this whole "VFS accesses the fs during
> > mount: problem is isolated to only the inode cache shrinker, and it
> > while a mount is in progress it does not run inode cache scans
> > (can't lock s_umount because the mount holds it locked).
> > 
> > IOWs, this crash can be fixed entirely by modifying the VFS code as
> > I mentioned to avoid super_cache_count() doing any work during mount
> > and unmount.
> > 
> > So, really, none of these XFS changes are necessary to fix the
> > problem that was reported. I'm not opposed to it as a cleanup, but
> > let's not conflate it with the VFS superblock shrinker bug that
> > needs to be fixed....
> 
> Are you working on such a patch?

Yes.

> I guessed that one could change the
> vfs to avoid register_shrinker until after fill_super returns?  Or
> possibly just have super_cache_count bail out if !SB_ACTIVE, though that
> could be messy given that xfs (and others) mess with that flag.

I think it's fine for the shrinker to check SB_ACTIVE, because if
it's not set then then inodes are not placed on thr LRUs that the
shrinker works on.  The problem is that the shrinker can run before
filesystems assert SB_ACTIVE, even though there will be nothing for
it to do and the filesystem has not set everything up. As such, I
think it's the right thing to do to match SB_ACTIVE checks from
iput_final() to super_cache_cache().

FWIW, if a filesystem is setting SB_ACTIVE earlier than the VFS,
then it better make sure it's already done all the setup it needs
to. The filesystem has to do that setup anyway (as the VFS requires
it), so it's just a matter of what order we do things in
->fill_super.

In XFS we have finished all the setup of the structures the shrinker
is dependent on by the time we set SB_ACTIVE in log recovery. Hence
I don't really see any problems with XFS or any of the other 
filesystems that set SB_ACTIVE in their mount routines.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2018-03-25 22:17 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-21 16:13 [PATCH] xfs: clean up xfs_mount allocation and dynamic initializers Brian Foster
2018-03-21 17:27 ` Darrick J. Wong
2018-03-21 18:07   ` Brian Foster
2018-03-22  0:19     ` Dave Chinner
2018-03-24  0:52       ` Darrick J. Wong
2018-03-25 22:17         ` Dave Chinner [this message]
2018-03-24  0:54 ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180325221744.GI18129@dastard \
    --to=david@fromorbit.com \
    --cc=bfoster@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).