From: "John Stoffel" <john@stoffel.org>
To: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>, Zdenek Kabelac <zkabelac@redhat.com>,
	Mikulas Patocka <mpatocka@redhat.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	dm-devel@redhat.com, Christoph Hellwig <hch@lst.de>,
	"Darrick J. Wong" <djwong@kernel.org>
Subject: Re: [PATCH] fix writing to the filesystem after unmount
Date: Fri, 8 Sep 2023 12:49:07 -0400
Message-ID: <25851.20611.732252.455034@quad.stoffel.home>
In-Reply-To: <20230908-verflachen-neudefinition-4da649d673a9@brauner>

>>>>> "Christian" == Christian Brauner <brauner@kernel.org> writes:

>> Well, currently you click some "Eject / safely remove / whatever" button
>> and then you get a "wait" dialog until everything is done after which
>> you're told the stick is safe to remove. What I imagine is that the "wait"
>> dialog needs to be there while there are any (or exclusive at minimum) openers
>> of the device. Not until umount(2) syscall has returned. And yes, the

> Agreed. umount(2) doesn't give guarantees about a filesystem being
> really gone once it has returned. And it really shouldn't. There's
> too many cases where that doesn't work and it's not a commitment we
> should make.

So how the heck is someone supposed to know, from userspace, that a
filesystem is unmounted?  Just wearing my SysAdmin hat, this strikes
me as really potentially painful and annoying.  But then again, so are
bind mounts from a lot of views too.

Don't people remember how bad it can be when you are trying to
shut down a system and it hangs because a remote NFS server is down
and not responding?  And your client system hangs completely?

> And there are ways to wait until superblock shutdown that I've
> mentioned before in other places where it somehow really
> matters. inotify's IN_UNMOUNT will notify about superblock
> shutdown. IOW, when it really hits generic_shutdown_super() which
> can only be hit after unfreezing as that requires active references.

Can we maybe invert this discussion and think about it from the
userspace side of things?  How does the user(space) mount/unmount
devices cleanly and reliably?  

> So this really can be used to wait for a filesystem to go away across
> all namespaces, and across filesystem freezing and it's available to
> unprivileged users. Simple example:

> # shell 1
> sudo mount -t xfs /dev/sda /mnt
> sudo mount --bind /mnt /opt
> inotifywait -e unmount /mnt

> #shell 2
> sudo umount /opt # nothing happens in shell 1
> sudo umount /mnt # shell 1 gets woken
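A minimal sketch of how a script might wrap the quoted wait with a timeout, so that a hung unmount does not block forever. The function name `wait_for_unmount` and the `INOTIFYWAIT` override variable are hypothetical; `-e unmount` comes from the example above, and `-t` (timeout in seconds) and `-qq` (suppress output) are assumed to behave as in inotify-tools' inotifywait(1).

```shell
#!/bin/sh
# wait_for_unmount PATH TIMEOUT_SECONDS
# Block until inotify reports an unmount event for PATH, or until the
# timeout expires. Returns inotifywait's exit status (0 when the event
# was seen, non-zero otherwise).
# INOTIFYWAIT is a hypothetical hook to substitute the binary in tests.
wait_for_unmount() {
    "${INOTIFYWAIT:-inotifywait}" -qq -e unmount -t "$2" "$1"
}

# Possible usage; the watch has to be in place before the umount runs:
#   wait_for_unmount /mnt 30 &
#   sudo umount /mnt
#   wait $! && echo "superblock for /mnt is gone"
```

As in the quoted example, the event should only arrive once the last mount of the filesystem is gone, so a leftover bind mount would show up here as a timeout rather than as an error message.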

So what makes this *useful* to anyone?  Why doesn't the bind mount
A) lock /mnt into place, and B) give you some way of seeing that
there's a bind mount in place?

>> corner-cases. So does the current behavior, I agree, but improving
>> situation for one usecase while breaking another usecase isn't really a way
>> forward...

> Agreed.

>> Well, the filesystem (struct super_block to be exact) is invisible
>> in /proc/mounts (or whatever), that is true. But it is still very
>> much associated with that block device and if you do 'mount
>> <device> <mntpoint>', you'll get it back. But yes, the filesystem
>> will not go away

Then should it be unmountable in the first place?  I mean yes, we
always need a way to force an unmount no matter what, even if that
breaks some other process on the system, but for regular use,
shouldn't it just give back an error like:

	  /mnt in use by bind mount /opt

or something like that?  Give the poor sysadmin some information on
what's going on here. 
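For what can be seen from userspace today, here is a rough sketch that groups mount points by device number, so that a filesystem mounted in more than one place (for example through a bind mount) stands out. It assumes the mountinfo layout from proc(5), where field 3 is major:minor and field 5 is the mount point; the function name `list_shared_mounts` is made up for illustration. It only sees the current mount namespace, and filesystems that report a separate device number per subvolume may not group as expected.

```shell
#!/bin/sh
# list_shared_mounts: read mountinfo-format lines on stdin and print each
# device number (major:minor) that is mounted more than once, followed by
# all of its mount points.
list_shared_mounts() {
    awk '
        { dev = $3; count[dev]++; points[dev] = points[dev] " " $5 }
        END { for (d in count) if (count[d] > 1) print d ":" points[d] }
    '
}

# Possible usage on a live system (current mount namespace only):
#   list_shared_mounts < /proc/self/mountinfo
```

This is not the error message asked for above, but it would at least let an admin check for other mounts of the same filesystem before running umount.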

> And now we at least have an api to detect that case and refuse to reuse
> the superblock.

>> until all references to it are dropped and you cannot easily find
>> who holds those references and how to get rid of them.

ding ding ding!!!!  I don't want to have to run 'lsof' or something
like that.

> Namespaces make this even messier. You have no easy way of knowing
> whether the filesystem isn't pinned somewhere else through an
> explicit bind-mount or when it was copied during mount namespace
> creation.

This is the biggest downside of namespaces and bind mounts in my
mind.  The lack of traceability back to the source.  
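A partial workaround for the traceability problem, sketched under some assumptions: walk every process's mountinfo and report which PIDs still have a given device mounted in their mount namespace. The name `find_pinners` and its optional second argument are hypothetical. Reading other users' /proc/PID/mountinfo generally needs root, and a namespace with no remaining processes in it would not be found this way.

```shell
#!/bin/sh
# find_pinners DEV [PROCROOT]
# Print "PID MOUNTPOINT" for every process whose mount namespace has the
# filesystem with device number DEV (major:minor) mounted.
# PROCROOT defaults to /proc; the parameter exists only to allow testing.
find_pinners() {
    dev=$1
    procroot=${2:-/proc}
    for f in "$procroot"/[0-9]*/mountinfo; do
        [ -r "$f" ] || continue          # skip unreadable or vanished entries
        pid=${f%/mountinfo}
        pid=${pid##*/}
        awk -v dev="$dev" -v pid="$pid" '$3 == dev { print pid, $5 }' "$f" 2>/dev/null
    done
}

# Possible usage:
#   sudo sh -c '. ./find_pinners.sh; find_pinners 8:0'
```

Processes that share a mount namespace each print the same mount points, so the output is best read as "who can still see this filesystem" rather than as a list of distinct references.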
