linux-fsdevel.vger.kernel.org archive mirror
* Remount root RO after the root dentry drops from the namespace
@ 2009-11-18 11:12 Daniel Drake
  2009-11-18 11:50 ` Miklos Szeredi
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Drake @ 2009-11-18 11:12 UTC (permalink / raw)
  To: linux-fsdevel

Hi,

OLPC ships a slightly strange filesystem layout. We keep everything
under /versions/run/<hash>. For example, libc is
at /versions/run/<hash>/lib/libc.so.6

The background behind this is that we can then create a mass of
hardlinks to that original OS, and do a safe OS update by downloading
files that have changed into that 2nd tree, breaking the hardlinks as we
go. At the end we can atomically update the symlink that points to
the /versions/run/<hash> which determines which OS is booted into on
reboot. Therefore we have atomic OS updates which can be interrupted
without consequences.

Once the system is booted, this weird layout is barely visible, because
the initramfs performs a series of chroot and mount --move steps in
order to make things function as normal. Assuming that the initramfs
mounts the root partition at /sysroot in its namespace and the hash
we're booting into is just "1", then this is what happens:

1. cd /sysroot
2. mount --move . /
3. chroot .
4. cd /
5. chdir /versions/run/1
6. chroot .
7. cd /
8. exec /sbin/init

This works well and the system functions as normal. However, I've
noticed that during shutdown, the root filesystem is never unmounted
cleanly.

"mount -o remount,ro /" always fails due to this check in do_remount():
	if (path->dentry != path->mnt->mnt_root) {
		return -EINVAL;
	}

Obviously the dentry for the / path that we are trying to remount is not
the actual root of the mount. However the root of the mount is long
gone, so I'm not sure what we can do. And shutting down cleanly is
obviously important!
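
A toy model of that check, as a sketch only: `Mount` and `do_remount_check` are hypothetical names, and dentries are represented as plain path strings rather than kernel objects. It illustrates why the remount fails once "/" is a subdirectory of the mounted filesystem.

```python
import errno
from dataclasses import dataclass


@dataclass(frozen=True)
class Mount:
    """Toy stand-in for a mount: only its root dentry matters here."""
    mnt_root: str  # path of the mount's root inside its own filesystem


def do_remount_check(path_dentry: str, mnt: Mount) -> int:
    """Mirror the quoted check: the path must name the root of its mount."""
    if path_dentry != mnt.mnt_root:
        return -errno.EINVAL
    return 0


# The root partition was mounted with its own "/" as the mount root, but after
# the second chroot the process's "/" is the dentry for /versions/run/1.
root_partition = Mount(mnt_root="/")
print(do_remount_check("/versions/run/1", root_partition))  # rejected
print(do_remount_check("/", root_partition))                # accepted
```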

Any thoughts?

Thanks!
Daniel




* Re: Remount root RO after the root dentry drops from the namespace
  2009-11-18 11:12 Remount root RO after the root dentry drops from the namespace Daniel Drake
@ 2009-11-18 11:50 ` Miklos Szeredi
  2009-11-18 12:38   ` Daniel Drake
  0 siblings, 1 reply; 9+ messages in thread
From: Miklos Szeredi @ 2009-11-18 11:50 UTC (permalink / raw)
  To: Daniel Drake; +Cc: linux-fsdevel

On Wed, 18 Nov 2009, Daniel Drake wrote:
> 1. cd /sysroot
> 2. mount --move . /
> 3. chroot .
> 4. cd /
> 5. chdir /versions/run/1
> 6. chroot .
> 7. cd /
> 8. exec /sbin/init
> 
> This works well and the system functions as normal. However, I've
> noticed that during shutdown, the root filesystem is never unmounted
> cleanly.
> 
> "mount -o remount,ro /" always fails due to this check in do_remount():
> 	if (path->dentry != path->mnt->mnt_root) {
> 		return -EINVAL;
> 
> Obviously the dentry for the / path that we are trying to remount is not
> the actual root of the mount. However the root of the mount is long
> gone, so I'm not sure what we can do. And shutting down cleanly is
> obviously important!

You can easily make a directory a root of a mount with

 mount --bind $DIR $DIR

In your example, add this before 5.
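
A sketch of how the sequence might look with that step added, assuming $DIR is /versions/run/<hash> as seen after the first chroot. The helper `boot_steps` is hypothetical and only builds the list of steps as text; it does not run anything.

```python
from typing import List


def boot_steps(version: str) -> List[str]:
    """Return the original eight steps plus the bind mount before the chdir."""
    target = f"/versions/run/{version}"
    return [
        "cd /sysroot",
        "mount --move . /",
        "chroot .",
        "cd /",
        f"mount --bind {target} {target}",  # added: makes target a mount root
        f"chdir {target}",
        "chroot .",
        "cd /",
        "exec /sbin/init",
    ]


for number, step in enumerate(boot_steps("1"), start=1):
    print(f"{number}. {step}")
```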

Thanks,
Miklos


* Re: Remount root RO after the root dentry drops from the namespace
  2009-11-18 11:50 ` Miklos Szeredi
@ 2009-11-18 12:38   ` Daniel Drake
  2009-11-18 12:58     ` Miklos Szeredi
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Drake @ 2009-11-18 12:38 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel

On Wed, 2009-11-18 at 12:50 +0100, Miklos Szeredi wrote:
> You can easily make a directory a root of a mount with
> 
>  mount --bind $DIR $DIR
> 
> In your example, add this before 5.

Thanks for the suggestion! And following on from that, we would do
"mount -o remount,ro /path/to/realroot" during shutdown
where /path/to/realroot is the bind mount that we created based on your
advice?

Unfortunately, that approach doesn't solve the problem. If you remount a
bind-mount as read only then the "real" underlying mount is unaffected.
See the code flow in do_remount() :

	if (flags & MS_BIND)
		err = change_mount_flags(path->mnt, flags);
	else
		err = do_remount_sb(sb, flags, data, 0);

Any other ideas?

Thanks,
Daniel




* Re: Remount root RO after the root dentry drops from the namespace
  2009-11-18 12:38   ` Daniel Drake
@ 2009-11-18 12:58     ` Miklos Szeredi
  2009-11-18 14:52       ` Karel Zak
  2009-11-18 15:18       ` Daniel Drake
  0 siblings, 2 replies; 9+ messages in thread
From: Miklos Szeredi @ 2009-11-18 12:58 UTC (permalink / raw)
  To: Daniel Drake; +Cc: miklos, linux-fsdevel

On Wed, 18 Nov 2009, Daniel Drake wrote:
> On Wed, 2009-11-18 at 12:50 +0100, Miklos Szeredi wrote:
> > You can easily make a directory a root of a mount with
> > 
> >  mount --bind $DIR $DIR
> > 
> > In your example, add this before 5.
> 
> Thanks for the suggestion! And following on from that, we would do
> "mount -o remount,ro /path/to/realroot" during shutdown
> where /path/to/realroot is the bind mount that we created based on your
> advice?

I see some confusion here, there's no "path/to/realroot", the bind
mount didn't actually do anything to the namespace, it just made
"/versions/run/1" (which will later become "/") a mountpoint.

At shutdown there's nothing special to do, other than 
"mount -o remount,ro /"

> 
> Unfortunately, that approach doesn't solve the problem. If you remount a
> bind-mount as read only then the "real" underlying mount is unaffected.
> See the code flow in do_remount() :
> 
> 	if (flags & MS_BIND)
> 		err = change_mount_flags(path->mnt, flags);
> 	else
> 		err = do_remount_sb(sb, flags, data, 0);
> 

Note: the "flags" tested here are supplied from the argument of
mount(2), and are not flags stored in the mount.  And btw, MS_BIND is
not stored in the mount at all, there's absolutely no difference
between a mount created with "--bind" and one without.

So my suggestion should work fine.
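
To make the point about "flags" concrete, here is a small model of the quoted branch. It is a sketch, not kernel code: `remount_target` is a hypothetical helper, and the constants are the values from <sys/mount.h> on Linux. The decision depends only on what the caller passes to mount(2).

```python
MS_RDONLY = 1     # mount read-only
MS_REMOUNT = 32   # alter flags of an existing mount
MS_BIND = 4096    # bind mount action


def remount_target(flags: int) -> str:
    """Name the function the quoted branch would call for these flags."""
    if not flags & MS_REMOUNT:
        raise ValueError("not a remount request")
    if flags & MS_BIND:
        return "change_mount_flags"  # per-mount flags only
    return "do_remount_sb"           # superblock is remounted


print(remount_target(MS_REMOUNT | MS_RDONLY))
print(remount_target(MS_REMOUNT | MS_BIND | MS_RDONLY))
```

In this model a remount request without MS_BIND reaches the superblock regardless of how the mount was originally created.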

Thanks,
Miklos


* Re: Remount root RO after the root dentry drops from the namespace
  2009-11-18 12:58     ` Miklos Szeredi
@ 2009-11-18 14:52       ` Karel Zak
  2009-11-18 15:28         ` Miklos Szeredi
  2009-11-18 15:18       ` Daniel Drake
  1 sibling, 1 reply; 9+ messages in thread
From: Karel Zak @ 2009-11-18 14:52 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Daniel Drake, linux-fsdevel

On Wed, Nov 18, 2009 at 01:58:31PM +0100, Miklos Szeredi wrote:
> On Wed, 18 Nov 2009, Daniel Drake wrote:
> > 
> > 	if (flags & MS_BIND)
> > 		err = change_mount_flags(path->mnt, flags);
> > 	else
> > 		err = do_remount_sb(sb, flags, data, 0);
> > 
> 
> Note: the "flags" tested here are supplied from the argument of
> mount(2), and are not flags stored in the mount.  And btw, MS_BIND is
> not stored in the mount at all, there's absolutely no difference
> between a mount created with "--bind" and one without.

 The "flags" argument of mount(2) always contains MS_BIND for bind
 mounts (if you use standard mount(8) command and /etc/mtab) for -o
 remount.
 
    Karel

-- 
 Karel Zak  <kzak@redhat.com>


* Re: Remount root RO after the root dentry drops from the namespace
  2009-11-18 12:58     ` Miklos Szeredi
  2009-11-18 14:52       ` Karel Zak
@ 2009-11-18 15:18       ` Daniel Drake
  1 sibling, 0 replies; 9+ messages in thread
From: Daniel Drake @ 2009-11-18 15:18 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel

On Wed, 2009-11-18 at 13:58 +0100, Miklos Szeredi wrote:
> I see some confusion here, there's no "path/to/realroot", the bind
> mount didn't actually do anything to the namespace, it just made
> "/versions/run/1" (which will later become "/") a mountpoint.

Ah yes, I misunderstood.

This bind-mounting trick actually allows for a simplification to the
whole procedure I listed earlier. Everything is working now.

We previously added support to util-linux-ng's switch_root utility to
execute that procedure for the case when the target root is not a mount
point. I'll submit a patch to revert that and instead note this trick in
the manpage.

Thanks for your help!
Daniel




* Re: Remount root RO after the root dentry drops from the namespace
  2009-11-18 14:52       ` Karel Zak
@ 2009-11-18 15:28         ` Miklos Szeredi
  2009-11-18 17:08           ` Karel Zak
  0 siblings, 1 reply; 9+ messages in thread
From: Miklos Szeredi @ 2009-11-18 15:28 UTC (permalink / raw)
  To: Karel Zak; +Cc: miklos, dsd, linux-fsdevel

On Wed, 18 Nov 2009, Karel Zak wrote:
>  The "flags" argument of mount(2) always contains MS_BIND for bind
>  mounts (if you use standard mount(8) command and /etc/mtab) for -o
>  remount.

I think that's a bug.  I can see the reasoning behind it, but it's
wrong.  mount(8) should not guess whether the user wants to change the
per superblock flags or the per mount flags based on *how* the mount
was created.

Look:

  mount /dev/hda1 /mnt1; mount /dev/hda1 /mnt2

is completely equivalent to

  mount /dev/hda1 /mnt1; mount --bind /mnt1 /mnt2

Why should "mount -oremount" treat one differently than the other?

It would be much cleaner to have a mount option, say
--change-mnt-flags, that is equivalent to "--bind -oremount".

Thanks,
Miklos


* Re: Remount root RO after the root dentry drops from the namespace
  2009-11-18 15:28         ` Miklos Szeredi
@ 2009-11-18 17:08           ` Karel Zak
  2009-11-18 17:38             ` Miklos Szeredi
  0 siblings, 1 reply; 9+ messages in thread
From: Karel Zak @ 2009-11-18 17:08 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: dsd, linux-fsdevel

On Wed, Nov 18, 2009 at 04:28:23PM +0100, Miklos Szeredi wrote:
> On Wed, 18 Nov 2009, Karel Zak wrote:
> >  The "flags" argument of mount(2) always contains MS_BIND for bind
> >  mounts (if you use standard mount(8) command and /etc/mtab) for -o
> >  remount.
> 
> I think that's a bug.  I can see the reasoning behind it, but it's

 Don't blame userspace :-) see kernel patch:

 commit 2e4b7fcd926006531935a4c79a5e9349fe51125b
 Author: Dave Hansen <haveblue@us.ibm.com>
 Date:   Fri Feb 15 14:38:00 2008 -0800

 [PATCH] r/o bind mounts: honor mount writer counts at remount

> wrong.  mount(8) should not guess whether the user wants to change the
> per superblock flags or the per mount flags based on *how* the mount
> was created.
> 
> Look:
> 
>   mount /dev/hda1 /mnt1; mount /dev/hda1 /mnt2
> 
> is completely equivalent to
> 
>   mount /dev/hda1 /mnt1; mount --bind /mnt1 /mnt2
> 
> Why should "mount -oremount" treat one differently than the other?

 It's the kernel that cares about MS_REMOUNT & MS_BIND. And the kernel's
 MS_REMOUNT behavior was silently changed one year ago.
 
 The mount(8) command has had an exactly defined way of handling mount
 options for many years. There is no exception for MS_BIND. It reads
 options from fstab/mtab and applies options from the command line;
 the result is sent to the kernel.

 I think the current (undocumented:-) kernel behavior is not clear.
 We use two options for three separate operations: MS_REMOUNT (remount
 superblock), MS_BIND (loopback) and MS_REMOUNT & MS_BIND (change mnt
 flags).

> It would be much cleaner to have a mount option, say
> --change-mnt-flags, that is equivalent to "--bind -oremount".

 Maybe, the question is how many users already depend on mount --bind
 /old /new; mount -o remount,ro /new.

    Karel

-- 
 Karel Zak  <kzak@redhat.com>


* Re: Remount root RO after the root dentry drops from the namespace
  2009-11-18 17:08           ` Karel Zak
@ 2009-11-18 17:38             ` Miklos Szeredi
  0 siblings, 0 replies; 9+ messages in thread
From: Miklos Szeredi @ 2009-11-18 17:38 UTC (permalink / raw)
  To: Karel Zak; +Cc: miklos, dsd, linux-fsdevel, mtk.manpages

On Wed, 18 Nov 2009, Karel Zak wrote:
>  It's the kernel that cares about MS_REMOUNT & MS_BIND. And the kernel's
>  MS_REMOUNT behavior was silently changed one year ago.

Yeah, that's an ABI breakage.  Luckily nobody seemed to care, so
everything is fine :)

>  The mount(8) command has had an exactly defined way of handling mount
>  options for many years. There is no exception for MS_BIND. It reads
>  options from fstab/mtab and applies options from the command line;
>  the result is sent to the kernel.

That's debatable, MS_BIND is not a mount option, it's a mount
*action*.  Just like MS_REMOUNT.  So the way mount(8) handles MS_BIND
is rather inconsistent.

>  I think the current (undocumented:-) kernel behavior is not clear.

CC-ing Michael Kerrisk.  Michael, could you please add an explanation
of (MS_REMOUNT & MS_BIND) to the mount(2) man page?  When these flags
are used together, they change the per-mount flags (6th field in
/proc/self/mountinfo), not the per-superblock flags (last field in
/proc/self/mountinfo).
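
For reference, a minimal sketch of pulling those two fields out of a mountinfo line. `MountInfo` and `parse_mountinfo_line` are hypothetical names, and the sample line is invented for illustration rather than taken from a real system.

```python
from typing import NamedTuple


class MountInfo(NamedTuple):
    root: str           # 4th field: root of the mount within the filesystem
    mount_point: str    # 5th field
    mount_options: str  # 6th field: per-mount flags
    super_options: str  # last field: per-superblock flags


def parse_mountinfo_line(line: str) -> MountInfo:
    """Split one line of /proc/self/mountinfo into the fields of interest."""
    fields = line.split()
    # Optional fields follow the 6th field and end at a lone "-" separator.
    fields.index("-", 6)  # raises ValueError if the separator is missing
    return MountInfo(
        root=fields[3],
        mount_point=fields[4],
        mount_options=fields[5],
        super_options=fields[-1],
    )


# Invented sample: a read-only mount on top of a read-write superblock.
sample = "20 1 8:1 /versions/run/1 / ro,relatime shared:1 - ext3 /dev/sda1 rw,errors=continue"
info = parse_mountinfo_line(sample)
print(info.mount_options)
print(info.super_options)
```

To inspect a live system, the same function could be applied to each line read from /proc/self/mountinfo.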


>  We use two options for three separate operations: MS_REMOUNT (remount
>  superblock), MS_BIND (loopback) and MS_REMOUNT & MS_BIND (change mnt
>  flags).

The mount(2) API is a big mess...

> > It would be much cleaner to have a mount option, say
> > --change-mnt-flags, that is equivalent to "--bind -oremount".
> 
>  Maybe, the question is how many users already depend on mount --bind
>  /old /new; mount -o remount,ro /new.

I have no idea.  I just know that the current behavior is inconsistent
and constraining.  And it won't help migrating mount(8) off of
/etc/mtab.

Thanks,
Miklos


