From: Dave Chinner <david@fromorbit.com>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: Amir Goldstein <amir73il@gmail.com>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	Jeff Mahoney <jeffm@suse.com>, Theodore Tso <tytso@mit.edu>,
	Jan Kara <jack@suse.cz>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Subject: Re: [PATCH] xfs: avoid lockdep false positives in xfs_trans_alloc
Date: Thu, 4 Oct 2018 15:38:38 +1000	[thread overview]
Message-ID: <20181004053838.GT18567@dastard> (raw)
In-Reply-To: <CAJfpegsHwm=w8HhqEuYTBE1yFkhiTFB5Su1f2VrAB9XPhWrrbg@mail.gmail.com>

On Thu, Oct 04, 2018 at 01:14:12AM +0200, Miklos Szeredi wrote:
> On Thu, Oct 4, 2018 at 12:59 AM, Dave Chinner <david@fromorbit.com> wrote:
> > On Wed, Oct 03, 2018 at 06:45:13AM +0300, Amir Goldstein wrote:
> >> On Wed, Oct 3, 2018 at 2:14 AM Dave Chinner <david@fromorbit.com> wrote:
> >> [...]
> >> > > Seems like freezing any of the layers if overlay itself is not frozen
> >> > > is not a good idea.
> >> >
> >> > That's something we can't directly control. e.g. lower filesystem is
> >> > on a DM volume. DM can freeze the lower filesystem through the block
> >> > device when a dm command is run. It may well be that the admins that
> >> > set up the storage and filesystem layer have no idea that there are
> >> > now overlay users on top of the filesystem they originally set up.
> >> > Indeed, the admins may not even know that dm operations freeze
> >> > filesystems because it happens completely transparently to them.
> >> >
> >>
> >> I don't think we should be binding the stacked filesystem issues with
> >> the stacked block over fs issues.
> >
> > It's the same problem.  Hacking a one-off solution to hide a specific
> > overlay symptom does not address the root problem. And, besides, if
> > you stack like this:
> >
> > overlay
> >   lower_fs
> >     loopback dev
> >       loop img fs
> >
> > And freeze the loop img fs, overlay can still get stuck in its
> > shrinker because the lower_fs gets stuck doing IO on the frozen
> > loop img fs.
> >
> > i.e. it's the same issue - kswapd will get stuck doing reclaim from
> > the overlay shrinker.
> 
> Is overlay making the situation any worse in this case?  IOW, would

No, it's not. My point is that it's the same layering problem, not
that it is an issue unique to overlay. Fixing one without addressing
the other does not make the problem go away.
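The stacked-freeze hang described above can be sketched with a toy model. This is plain Python with hypothetical names, not kernel code; it only illustrates the dependency: each layer's shrinker needs IO against the layer below it, so a freeze anywhere in the stack blocks reclaim at every layer above it.

```python
# Toy model of the overlay -> lower_fs -> loop dev -> loop img fs stack.
# Hypothetical names; illustrates how a freeze at the bottom of the
# stack propagates up and blocks the top-level shrinker (i.e. kswapd).

class Fs:
    def __init__(self, name, backing=None):
        self.name = name          # filesystem label
        self.backing = backing    # the fs this one does IO against, if any
        self.frozen = False

    def io_blocked(self):
        # IO blocks if this fs is frozen, or if the fs backing its
        # block device is frozen (checked transitively down the stack).
        if self.frozen:
            return True
        return self.backing.io_blocked() if self.backing else False

    def shrinker(self):
        # Reclaim from this fs requires IO; if IO is blocked anywhere
        # below, the shrinker (and the thread running it) hangs.
        return "hung" if self.io_blocked() else "reclaimed"

loop_img_fs = Fs("loop img fs")
lower_fs = Fs("lower_fs", backing=loop_img_fs)   # lives on a loop device
overlay = Fs("overlay", backing=lower_fs)

loop_img_fs.frozen = True      # freeze only the bottom layer
print(overlay.shrinker())      # prints "hung" - reclaim blocks at the top
```

The point of the sketch is that the blockage is transitive: it makes no difference which layer is frozen, only that it sits somewhere below the shrinker doing reclaim.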

> >> If vfs stores a reverse tree of stacked fs dependencies, then individual
> >> sb freeze can be solved.
> >
> > Don't make me mention bind mounts... :/
> 
> How do mounts have anything to do with this?

Bind mounts layer superblocks in unpredictable manners. e.g. the
lowerfs for overlay could be a directory hierarchy made up of
multiple bind mounts from different source filesystems. So it's not
just a 1:1 upper/lower superblock relationship; it could be 1:many.
Dependency graphs get complex quickly in such configurations.

You can also have the same superblock above overlay through a bind
mount as well as below as the lower fs.  e.g. A user could freeze
the bind mounted hierarchy after checking it's a different device to
the filesystem they are also writing to (e.g. logs to "root" device,
freezes "home" device). IOWs, they could be completely unaware that
the freeze target is the same filesystem that underlies the "root"
device, and so when they freeze the "home" target their root device
also freezes.

That sort of thing registers high up on the user WTF-o-meter, and
it's not at all obvious what went wrong or how to get out of that
mess. It's not clear to me if bind mounts should accept path based
superblock admin commands like freeze, because they potentially
expose problems far outside the realm that specific user/mount
namespace is allowed access to.

My point is that it's not obvious that there is a simple, clear
dependency hierarchy between superblocks because there are so many
ways they can be layered by userspace and layered filesystems and
block devices.
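One way to see the "freeze home, root freezes too" surprise is to model mounts as views onto superblocks: a bind mount is just another mount of the same superblock, and a path-based freeze acts on the superblock, not the path. A toy sketch (hypothetical names, not VFS code):

```python
# Toy model: mounts are paths pointing at superblocks, and a bind
# mount shares a superblock with its source. Freezing by path freezes
# the superblock, so every mount of that superblock freezes with it.

class Superblock:
    def __init__(self, name):
        self.name = name
        self.frozen = False

mounts = {}                      # mount path -> Superblock

root_sb = Superblock("rootfs")
mounts["/"] = root_sb
mounts["/home"] = root_sb        # bind mount: same superblock as "/"

def freeze(path):
    # Path-based admin command, but it operates on the superblock.
    mounts[path].frozen = True

def is_frozen(path):
    return mounts[path].frozen

# The user believes "/home" is a separate device and freezes it...
freeze("/home")
# ...and "/" is now frozen too: same superblock underneath.
print(is_frozen("/"))            # prints True
```

With many bind mounts from many source filesystems under one overlay lowerdir, the path-to-superblock mapping becomes 1:many in both directions, which is why a simple reverse tree of dependencies doesn't capture it.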

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


Thread overview: 16+ messages
2018-09-07  1:51 [PATCH] xfs: avoid lockdep false positives in xfs_trans_alloc Dave Chinner
2018-09-07 14:07 ` Brian Foster
2018-09-10  6:47 ` Christoph Hellwig
2018-09-30  7:56 ` Amir Goldstein
2018-10-01  1:09   ` Dave Chinner
2018-10-01  7:56     ` Amir Goldstein
2018-10-01 22:32       ` Dave Chinner
2018-10-02  4:02         ` Amir Goldstein
2018-10-02  6:39           ` Dave Chinner
2018-10-02  7:33             ` Miklos Szeredi
2018-10-02 23:14               ` Dave Chinner
2018-10-03  3:45                 ` Amir Goldstein
2018-10-03 22:59                   ` Dave Chinner
2018-10-03 23:14                     ` Miklos Szeredi
2018-10-04  5:38                       ` Dave Chinner [this message]
2018-10-04  7:33                         ` Miklos Szeredi
