* Re: User-visible context-mount API
2018-01-17 9:53 ` User-visible context-mount API Miklos Szeredi
@ 2018-01-17 11:06 ` Karel Zak
2018-01-18 9:48 ` Miklos Szeredi
2018-01-19 2:27 ` Al Viro
2018-01-19 6:32 ` Al Viro
1 sibling, 2 replies; 5+ messages in thread
From: Karel Zak @ 2018-01-17 11:06 UTC (permalink / raw)
To: Miklos Szeredi
Cc: Al Viro, David Howells, Jeff Layton, Eric W. Biederman,
linux-fsdevel, Linux API, util-linux, Michael Kerrisk (man-pages)
On Wed, Jan 17, 2018 at 10:53:36AM +0100, Miklos Szeredi wrote:
> [Adding util-linux@vger and Michael Kerrisk]
>
> On Wed, Jan 17, 2018 at 5:17 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> > On Tue, Jan 16, 2018 at 05:41:46PM +0100, Miklos Szeredi wrote:
> >
> >> Right.
> >>
> >> Still, those two (propagation and flags) are properties of the mount.
> >> No fundamental difference in how to handle them, that I see. Okay, we
> >> have MS_REC handling in the propagation and not in the flags, but
> >> that's something that might make sense for flags as well.
> >>
> >> What's more interesting is how MS_PRIVATE + MS_REC semantics are
> >> complete failure in the real world: the logical thing would be to mark
> >> a mount private on the supplied mount AND propagate an umount event to
> >> everywhere else.
> >
> > This is utter nonsense. Most of the time it's "Fedora, in its infinite
> > bogo^Wwisdom has made everything shared; I don't fucking need that
> > idiocy, so please unshare this, this and that". You really don't want
> > (or have permissions for) unmounting e.g. /mnt in namespace of init
> > when you do that.
> >
> > Sure, we get tons of bug reports. Due to idiotic Fedora setup, with
> > everything shared. The same setup that would go up in flames on the
> > semantics change you propose.
I guess "all shared" is a systemd requirement, so it's not
Fedora-specific, right?
> I wouldn't propose to change existing --make-private, as this would
> not be backward compatible. The new semantics would mean a new op,
> obviously.
Definitely.
> Documenting --make-private thing properly would also help. To me the
> wording "make private" strongly implies "I want to make submounts
> private to this instance". See for example rhbz#1432211.
All the propagation stuff is poorly documented in mount.8. It would be
nice to add a section about it to the man page. Volunteer? (My ability to
explain this topic to end-users is pretty limited...)
> > If anything, "private bind on itself" would be a useful operation.
> > Turning given location into a mountpoint, and having everything
> > under it looking as it used to, but with no propagation at all.
> > Without bothering anybody else, even if location currently happens
> > to be on a shared/master mount.
Good idea.
> > I can slap that together for mount(2), but I'm not sure what a sane
> > combination of flags for that would look like ;-)
What about a new flag (for the API) rather than trying to be smart with the
current flags? But I doubt that investing time in new mount(2)
features is a good idea.
> For fsmount I think it would be very useful thing to have.
Yes.
Karel
--
Karel Zak <kzak@redhat.com>
http://karelzak.blogspot.com
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: User-visible context-mount API
2018-01-17 9:53 ` User-visible context-mount API Miklos Szeredi
2018-01-17 11:06 ` Karel Zak
@ 2018-01-19 6:32 ` Al Viro
1 sibling, 0 replies; 5+ messages in thread
From: Al Viro @ 2018-01-19 6:32 UTC (permalink / raw)
To: Miklos Szeredi
Cc: David Howells, Jeff Layton, Eric W. Biederman, linux-fsdevel,
Linux API, util-linux, Michael Kerrisk (man-pages)
On Wed, Jan 17, 2018 at 10:53:36AM +0100, Miklos Szeredi wrote:
> Documenting --make-private thing properly would also help. To me the
> wording "make private" strongly implies "I want to make submounts
> private to this instance". See for example rhbz#1432211.
>
> > If anything, "private bind on itself" would be a useful operation.
> > Turning given location into a mountpoint, and having everything
> > under it looking as it used to, but with no propagation at all.
> > Without bothering anybody else, even if location currently happens
> > to be on a shared/master mount.
> >
> > I can slap that together for mount(2), but I'm not sure what a sane
> > combination of flags for that would look like ;-) For fsmount
> > I think it would be very useful thing to have.
>
> Yes, I think such an operation would be pretty useful. Not sure if
> it's the whole story, though.
FWIW, there's a fun variant of the API:
* fsopen(): string -> fsfd; takes fs type name, returns a file descriptor
connected to fs driver. Subsequent read/write on it is used to pass
options, authenticate, etc. - all you need to talk the driver into
creating an active instance.
* fspick(): location -> fsfd; fsfd connected to filesystem mounted at given
place. After that you can talk to the driver to get superblock-level
remount.
* new_mount(): fsfd x string -> fd. Creates a vfsmount and gives a file
descriptor for given relative pathname.
* clone_mount(): location x bool -> fd. Copies a vfsmount or an entire
subtree (depending upon the second parameter) and returns a file descriptor.
Basically, bind or rbind sans attaching it anywhere.
* change_flags(): fd x (propagation or vfsmount flags) x bool -> int
fd should point to root of some vfsmount (O_PATH, or the result of either of the
previous two). Flag is "do we want it to affect the entire subtree"; the tricky
question is what to do with vfsmount flags - for those we might want
things like "here's the full set" or "change those flags thus".
Hell knows - there might be two primitives there; the second one
would be fd x mask x new_flags x bool -> int, as in "set the bits
present in mask to values as in new_flags". Not sure.
* move_mount(): fd x location x bool -> int. fd - what to move, location -
where to put it, bool - do we want to suppress propagation. Potentially
hacky part is that if fd is not attached to anything, we simply attach it;
otherwise - move.
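
The two flavours of vfsmount flag update mentioned under change_flags() ("here's the full set" versus "set the bits present in mask to values as in new_flags") can be sketched in plain C. This is only an illustration of the bit arithmetic, not a proposed implementation: the DEMO_* bits and both function names are made up for the example and are not the kernel's MNT_* values.

```c
#include <assert.h>

/* Hypothetical flag bits, for illustration only (not the kernel's MNT_*). */
enum {
    DEMO_RDONLY = 1 << 0,
    DEMO_NOSUID = 1 << 1,
    DEMO_NODEV  = 1 << 2,
    DEMO_NOEXEC = 1 << 3,
};

/* "Here's the full set": the old flags are simply replaced. */
static unsigned set_all_flags(unsigned current, unsigned full_set)
{
    (void)current;              /* old value is irrelevant in this variant */
    return full_set;
}

/* "Set the bits present in mask to values as in new_flags":
 * bits outside mask keep their current value. */
static unsigned change_masked_flags(unsigned current, unsigned mask,
                                    unsigned new_flags)
{
    return (current & ~mask) | (new_flags & mask);
}

/* Small usage example: clear read-only and set noexec, leave nodev alone. */
void flags_demo(void)
{
    unsigned cur = DEMO_RDONLY | DEMO_NODEV;
    unsigned next = change_masked_flags(cur, DEMO_RDONLY | DEMO_NOEXEC,
                                        DEMO_NOEXEC);

    assert(next == (DEMO_NODEV | DEMO_NOEXEC));
    assert(set_all_flags(cur, DEMO_NOSUID) == DEMO_NOSUID);
}
```

The masked form lets a caller change one flag without first reading the others back, which is presumably why it might deserve to be a separate primitive.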
Normal mount: fsopen, talk to driver, new_mount, move_mount, close descriptors
mount --bind: fd = clone_mount(old, false); move_mount(fd, new, false); close
mount --rbind: clone_mount(old, true); move_mount; close
mount --make-shared et al.: open(..., O_PATH); change_flags; close
mount --move: open; move_mount; close
vfsmount-level remount: open; change_flags (or change_mount_flags, if we keep
it separate from topology ones); close
sb-level remount: fspick; talk to driver; close
make an arbitrary subtree really private (as discussed upthread):
fd = clone_mount(old, true); change_flags (or change_propagation_flags);
move_mount(fd, old, true); close(fd);
The tricky part in terms of implementation is that we want a
tree created by clone_mount() and never attached anywhere to be
dissolved on the final close() of the result of clone_mount().
It's not quite O_PATH - we want file_operations for that sucker
that would have ->release() doing that.
It would do namespace_lock(), check ->f_path.mnt for a flag and do
umount_tree() if not set, then namespace_unlock(). move_mount()
would set the flag.
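
The release logic described above can be modelled in userspace as a toy, just to make the flag handling concrete. Nothing here is kernel code: struct model_tree, the model_* functions and the lock counter are hypothetical stand-ins for the vfsmount, ->release(), move_mount() and namespace_lock()/namespace_unlock(); whatever locking move_mount() itself would need is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a tree returned by clone_mount(); names are hypothetical. */
struct model_tree {
    bool attached;   /* the flag move_mount() would set */
    bool dissolved;  /* true once the umount_tree() stand-in has run */
};

/* Stand-ins for namespace_lock()/namespace_unlock(): track nesting only. */
static int model_lock_depth;
static void model_namespace_lock(void)   { model_lock_depth++; }
static void model_namespace_unlock(void) { model_lock_depth--; }

/* move_mount() step: attaching marks the tree so release leaves it alone. */
static void model_move_mount(struct model_tree *t)
{
    assert(t != NULL);
    t->attached = true;
}

/* ->release() step: on the final close, dissolve the tree unless attached. */
static void model_release(struct model_tree *t)
{
    assert(t != NULL);
    model_namespace_lock();
    if (!t->attached)
        t->dissolved = true;    /* stands in for umount_tree() */
    model_namespace_unlock();
}
```

In this model a tree that is cloned and then closed without ever being moved ends up dissolved, while one that went through model_move_mount() survives the close.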