* read only bind mount ignores ready only
From: Phillip Susi @ 2013-12-11 14:37 UTC
To: util-linux

Forwarding report from
https://bugs.launchpad.net/ubuntu/+source/util-linux/+bug/712892

It seems that the kernel has a bug where it silently ignores the
MS_RDONLY flag when creating a bind mount.  mount issues a warning
that the mount point appears to be read-write even though you
requested read only.  The reporter suggests a patch to automatically
attempt to remount with MS_RDONLY before issuing this warning to work
around the kernel bug.  What do you think?
* Re: read only bind mount ignores ready only
From: Karel Zak @ 2013-12-11 16:49 UTC
To: Phillip Susi; +Cc: util-linux

On Wed, Dec 11, 2013 at 09:37:57AM -0500, Phillip Susi wrote:
> Forwarding report from
> https://bugs.launchpad.net/ubuntu/+source/util-linux/+bug/712892
>
> It seems that the kernel has a bug where it silently ignores the
> MS_RDONLY flag when creating a bind mount.  mount issues a warning

Yes, this is a known issue.

> that the mount point appears to be read-write even though you

I think that the libmount-based mount does not warn about it (mistake?).

> requested read only.  The reporter suggests a patch to automatically
> attempt to remount with MS_RDONLY before issuing this warning to work
> around the kernel bug.  What do you think?

Well, it means that the kernel disadvantage will never be fixed ;-)

It would be relatively simple to fix it, because libmount already
supports "additional mounts" to implement things like

    mount --make-private /dev/sda1 /mnt

(the kernel does not allow propagation flags to be used in a regular
mount operation).  I'll try it tomorrow.

The problem is that none of these userspace hacks are atomic...

    Karel

--
Karel Zak <kzak@redhat.com>
http://karelzak.blogspot.com
* Re: read only bind mount ignores ready only
From: Karel Zak @ 2013-12-12 12:05 UTC
To: Phillip Susi; +Cc: util-linux, Miklos Szeredi

[CC: kernel guys]

On Wed, Dec 11, 2013 at 09:37:57AM -0500, Phillip Susi wrote:
> It seems that the kernel has a bug where it silently ignores the
> MS_RDONLY flag when creating a bind mount.  mount issues a warning
> that the mount point appears to be read-write even though you
> requested read only.  The reporter suggests a patch to automatically
> attempt to remount with MS_RDONLY before issuing this warning to work
> around the kernel bug.  What do you think?

I have it implemented, so

    mount --bind --read-only /mnt /mnt

is interpreted as two requests (two mount(2) calls)

    mount --bind /mnt /mnt
    mount -o remount,bind,ro /mnt

It works as expected, but it does not work with MS_REC (recursive)
because the kernel currently does not support

    MS_REMOUNT|MS_BIND|MS_REC|...

It means that

    mount --rbind --read-only /mnt /mnt

creates only a top-level read-only mountpoint; the rest is unchanged.

Miklos, would it be possible to fix the kernel to accept MS_REC for the
MS_REMOUNT|MS_BIND|MS_RDONLY operation? Please.

It seems that all we need is to call the stuff in mnt_make_readonly()
for all next_mnt() items.

(Well, it would also be nice to teach the kernel to support
MS_BIND|MS_RDONLY, but that's probably a more invasive change.)

    Karel

--
Karel Zak <kzak@redhat.com>
http://karelzak.blogspot.com
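The two-step interpretation described in the message above can be sketched as a small flag-splitting routine. This is only an illustration, not the libmount implementation: the function name plan_bind_mount() is hypothetical, and the choice to drop MS_REC from the second call simply mirrors the statement in the thread that the kernel did not accept MS_REC together with MS_REMOUNT|MS_BIND at the time. Only the MS_* constants come from <sys/mount.h>.

```c
#include <sys/mount.h>

/*
 * Hypothetical helper (not libmount code): split one requested flag set
 * into the flag sets for the successive mount(2) calls.
 * Returns the number of entries written to out[] (1 or 2).
 */
int plan_bind_mount(unsigned long flags, unsigned long out[2])
{
    if ((flags & MS_BIND) && (flags & MS_RDONLY) && !(flags & MS_REMOUNT)) {
        /* Call 1: plain bind mount; MS_RDONLY would be ignored here. */
        out[0] = flags & ~(unsigned long)MS_RDONLY;
        /*
         * Call 2: remount the new bind mount read-only.  MS_REC is left
         * out (assumption taken from the thread), so only the top-level
         * mount point becomes read-only.
         */
        out[1] = MS_REMOUNT | MS_BIND | MS_RDONLY;
        return 2;
    }
    /* Anything else is passed through as a single call. */
    out[0] = flags;
    return 1;
}
```

A caller would then issue one mount(2) call per returned entry, the second one with the target as the only meaningful path. As noted earlier in the thread, the pair of calls is not atomic: the mount point is briefly visible read-write between them.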
* Re: read only bind mount ignores ready only
From: Phillip Susi @ 2013-12-12 14:59 UTC
To: Karel Zak; +Cc: util-linux, Miklos Szeredi

On 12/12/2013 7:05 AM, Karel Zak wrote:
> I have it implemented, so
>
>     mount --bind --read-only /mnt /mnt
>
> is interpreted as two requests (two mount(2) calls)
>
>     mount --bind /mnt /mnt
>     mount -o remount,bind,ro /mnt

And mount -o bind,ro is the same, right?  So you can set up a ro bind
mount in fstab?
* Re: read only bind mount ignores ready only
From: Karel Zak @ 2013-12-12 16:02 UTC
To: Phillip Susi; +Cc: util-linux, Miklos Szeredi

On Thu, Dec 12, 2013 at 09:59:10AM -0500, Phillip Susi wrote:
> On 12/12/2013 7:05 AM, Karel Zak wrote:
> > I have it implemented, so
> >
> >     mount --bind --read-only /mnt /mnt
> >
> > is interpreted as two requests (two mount(2) calls)
> >
> >     mount --bind /mnt /mnt
> >     mount -o remount,bind,ro /mnt
>
> And mount -o bind,ro is the same right?  So you can set up a ro bind
> mount in fstab?

Yes, but it is not in the git tree yet (I'd like to wait for Miklos's
reply).

    Karel

--
Karel Zak <kzak@redhat.com>
http://karelzak.blogspot.com
* Re: read only bind mount ignores ready only
From: Miklos Szeredi @ 2013-12-12 19:42 UTC
To: Karel Zak; +Cc: Phillip Susi, util-linux, Linux-Fsdevel

On Thu, Dec 12, 2013 at 1:05 PM, Karel Zak <kzak@redhat.com> wrote:
>
> [CC: kernel guys]
>
> On Wed, Dec 11, 2013 at 09:37:57AM -0500, Phillip Susi wrote:
>> It seems that the kernel has a bug where it silently ignores the
>> MS_RDONLY flag when creating a bind mount.  mount issues a warning
>> that the mount point appears to be read-write even though you
>> requested read only.  The reporter suggests a patch to automatically
>> attempt to remount with MS_RDONLY before issuing this warning to work
>> around the kernel bug.  What do you think?
>
> I have it implemented, so
>
>     mount --bind --read-only /mnt /mnt
>
> is interpreted as two requests (two mount(2) calls)
>
>     mount --bind /mnt /mnt
>     mount -o remount,bind,ro /mnt
>
> it works as expected, but it does not work with MS_REC (recursive)
> because kernel currently does not support
>
>     MS_REMOUNT|MS_BIND|MS_REC|...
>
> it means that
>
>     mount --rbind --read-only /mnt /mnt
>
> creates only top-level read-only mountpoint, the rest is unchanged.
>
>
> Miklos would be possible to fix kernel to accept MS_REC for
> MS_REMOUNT|MS_BIND|MS_RDONLY operation? Please.

I really hate the current mount(2) API.  It's a gigantic hack, and it's
nearing the end of its life anyway due to flags running out.

So instead of adding more hacks, I think it would be better to think
about adding a couple of syscalls that have clearly defined semantics.

Thanks,
Miklos
* Re: read only bind mount ignores ready only
From: Al Viro @ 2013-12-12 21:53 UTC
To: Miklos Szeredi; +Cc: Karel Zak, Phillip Susi, util-linux, Linux-Fsdevel

On Thu, Dec 12, 2013 at 08:42:54PM +0100, Miklos Szeredi wrote:
> I really hate the current mount(2) API.  It's a gigantic hack, and it's
> nearing the end of its life anyway due to flags running out.

You and me and just about anyone who'd ever looked at that mess ;-/

> So instead of adding more hacks, I think it would be better to think
> about adding a couple of syscalls that have clearly defined semantics.

It's not just flags, unfortunately.  Another problem stems from the fact
that the normal case used to be "mount the filesystem from this block
device on this directory", with additional flag added in v5 to indicate
whether we want it rw or ro (v1 to v4 had everything rw).

On any modern Unix, Linux included, that does not fit the reality.
First of all, the main property of filesystem is not a block device -
it's filesystem type.  I.e. the real type of mount(2) (the normal case,
after you shed all the cruft with remount, bind, etc.) is

    int (mountpoint, fs type, arguments specific for that fs type).

What's more, type-specific arguments really are almost entirely up to fs
driver.  "The block device of given filesystem" is not a well-defined
thing - it makes no sense for any network filesystem, for something like
procfs, for something that lives in userland, or uses more than one
block device, or lives on mtd device, etc.

Furthermore, even for types that do live on a single block device we
need more than just that device.  Even back in 1974 (v5), they had to
add a flag for rw vs. ro mounts.
For a while it looked like it would be possible to keep it as bitmap
(and the things were getting even more muddled by mixing the flags fs
itself doesn't care about into the same thing - e.g. nosuid/nodev/noexec
went there as well).  Alas, the things got even nastier with NFS and its
ilk - there had been too much extra data to hope to pack it into a
bitmap (timeout, etc.).

One approach had been

    type, mountpoint, flags, type-dependent pointer to struct

with flags still being a mix of "fs itself doesn't give a damn" ones
with ones that are very much for fs use (sync vs. async, for starters).
Pointer to device name had been hidden inside that struct in cases when
fs types needed one.  The really messy part of that approach is a binary
structure passed along, complete with alignment differences, size of
pointer headache, marshalling for case of userland filesystems, etc.
Moreover, mount(8) had to know the layouts of all these structures -
after all, it has to build one from the text you've got in fstab.  In
practice that meant separate binaries for different fs types -
mount_nfs, mount_xfs, etc., called by mount(8).  That's more or less
what *BSD had done.  Much later FreeBSD tried to go for array of pairs,
passed as an iovec (see nmount(2)).  At least nobody has been deranged
enough to pass XML...

Linux started with v7-like (even pre-v5-like; there was no ro/rw flag)
variant, proceeded to

    type x device name x opaque other data

and shortly after (in 0.97) to

    type x device name x flags x opaque other data.

With opaque data being sometimes a string of options, sometimes a binary
structure.  That led to all kinds of interesting headaches for 32bit vs.
64bit userland later on; these days it has mostly converged to

    device name x flags x opaque option string

- there are some exceptions, the worst offender being ncpfs.  Note that
device name is *also* opaque - it's interpreted by fs type.
The parts of kernel outside of specific fs have no idea what to do with
that thing; quite a few filesystems simply ignore it (common userland
conventions include "none" or fs type name itself), some treat it as a
pathname of block device, some interpret it as a mix of server name and
path on server, etc.  As far as the rest of the kernel (starting with
VFS) is concerned, device name is a part of opaque triple passed along
to fs driver.

Another ugly thing is that e.g. ncpfs needs a non-trivial dialog with
server and it's implemented thus: mount(2) is given enough information
to connect to server and mount something.  Server is not willing to give
any fs contents yet, though, so all we see is an empty directory.
mount(8) opens that directory and uses ioctl(2) to talk to server;
eventually that dialog convinces the server that we are to be allowed to
mount the sucker.  At that point the contents suddenly appear in the
previously empty directory.  No way for somebody looking at that empty
directory to tell if it's genuinely empty fs imported from the server or
just a half-authenticated one (you can see that ncpfs is mounted there,
but that's it).

Frankly, I wonder if we are trying to pack too much into one syscall -
not just in terms of overloading it (that much is obvious), but in terms
of trying to cram a sequence of syscalls into one.  If we end up
introducing new API(s) for mount(), it's probably worth considering
something like this:

	* open a connection to fs type driver, get a descriptor
	* use normal IO syscalls (usually just write(2)) on that
	  descriptor to tell fs type driver what do we want.  If any
	  kind of authentication is needed, that's the time for doing it
	* attach the thing identified by that descriptor to mountpoint

I have an old writeup somewhere (several variants of it, actually) on
possible replacement APIs; I'll try to dig it out and post it.
* Re: read only bind mount ignores ready only
From: Karel Zak @ 2013-12-13 10:45 UTC
To: Al Viro; +Cc: Miklos Szeredi, Phillip Susi, util-linux, Linux-Fsdevel

On Thu, Dec 12, 2013 at 09:53:25PM +0000, Al Viro wrote:
> Frankly, I wonder if we are trying to pack too much into one
> syscall - not just in terms of overloading it (that much is obvious),
> but in terms of trying to cram a sequence of syscalls into one.  If
> we end up introducing new API(s) for mount(), it's probably worth
> considering something like this:
>	* open a connection to fs type driver, get a descriptor
>	* use normal IO syscalls (usually just write(2)) on that
> descriptor to tell fs type driver what do we want.  If any kind of
> authentication is needed, that's the time for doing it
>	* attach the thing identified by that descriptor to mountpoint

Yes, exactly.  This has been my wish for years.

I don't think we need more *independent* syscalls to replace mount(2)
(for example a special syscall to change propagation flags, or so).
I strongly believe that APIs for complex tasks have to be based on
handles (file descriptors).  Such APIs are extensible.

It would also be nice to provide some information about the mount
operation to userspace through the file descriptor; that means
supporting read(2) and returning at least the mount ID.

The current situation, where we have only errno in userspace, is
insufficient.  If you want more information then you have to parse
/proc/self/mountinfo, but which entry is the right entry for the last
mount(2) call?

> I have an old writeup somewhere (several variants of it, actually) on possible
> replacement APIs; I'll try to dig it out and post it.

Please, share it :-)

    Karel

--
Karel Zak <kzak@redhat.com>
http://karelzak.blogspot.com
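To make the /proc/self/mountinfo point in the message above concrete, here is a minimal sketch of what userspace has to do today to learn a mount ID and the per-mount options. The struct and function names (mnt_entry, parse_mountinfo_line, has_opt) are hypothetical, and the fixed 256-byte buffers are an arbitrary simplification. It relies on the documented field order of mountinfo: mount ID, parent ID, major:minor, root, mount point, per-mount options. It does not answer the question raised in the message, namely which entry belongs to the most recent mount(2) call.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fields of interest. */
struct mnt_entry {
    int  id;              /* field 1: mount ID */
    int  parent;          /* field 2: parent mount ID */
    char target[256];     /* field 5: mount point */
    char vfs_opts[256];   /* field 6: per-mount options (ro/rw, nosuid, ...) */
};

/* Parse one mountinfo line; returns 0 on success, -1 if malformed. */
int parse_mountinfo_line(const char *line, struct mnt_entry *e)
{
    /* Fields 3 (major:minor) and 4 (root) are skipped with %*s. */
    int n = sscanf(line, "%d %d %*s %*s %255s %255s",
                   &e->id, &e->parent, e->target, e->vfs_opts);
    return n == 4 ? 0 : -1;
}

/* Return 1 if the comma-separated option list contains exactly `name`. */
int has_opt(const char *opts, const char *name)
{
    size_t len = strlen(name);
    const char *p = opts;

    while (*p) {
        const char *end = strchr(p, ',');
        size_t n = end ? (size_t)(end - p) : strlen(p);

        if (n == len && strncmp(p, name, len) == 0)
            return 1;
        if (!end)
            break;
        p = end + 1;
    }
    return 0;
}
```

For a read-only bind mount of a read-write filesystem, field 6 is the place where "ro" shows up, while the superblock options after the " - " separator still say "rw". That is why the sketch looks at field 6 only.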
* Re: read only bind mount ignores ready only
From: Karel Zak @ 2013-12-13 8:18 UTC
To: Miklos Szeredi; +Cc: Phillip Susi, util-linux, Linux-Fsdevel

On Thu, Dec 12, 2013 at 08:42:54PM +0100, Miklos Szeredi wrote:
> On Thu, Dec 12, 2013 at 1:05 PM, Karel Zak <kzak@redhat.com> wrote:
> > Miklos would be possible to fix kernel to accept MS_REC for
> > MS_REMOUNT|MS_BIND|MS_RDONLY operation? Please.
>
> I really hate the current mount(2) API.  It's a gigantic hack, and it's

We all hate it, but we have to use it every day...

> nearing the end of its life anyway due to flags running out.

Well, the current problem with MS_REC is just one small inconsistency
in the current MS_REMOUNT|MS_BIND semantics.  It would be really nice to
fix it now.

> So instead of adding more hacks, I think it would be better to think
> about adding a couple of syscalls that have clearly defined semantics.

Yes, but that's a (very) long-term goal...

    Karel

--
Karel Zak <kzak@redhat.com>
http://karelzak.blogspot.com
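Until the kernel honours MS_REC for a bind remount, one conceivable userspace approach is to walk the mount table and remount every mount point below the target individually with MS_REMOUNT|MS_BIND|MS_RDONLY. Nobody in this thread proposes that, so treat the following purely as an illustrative assumption; it would also inherit the non-atomicity already pointed out for userspace workarounds. The helper names mount_is_under() and select_submounts() are hypothetical, and the sketch only covers the path-selection step, not the mount(2) calls themselves.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if `target` is `root` itself or lies below it.
 * A plain prefix test is not enough: "/mnt2" must not match "/mnt".
 */
int mount_is_under(const char *root, const char *target)
{
    size_t n = strlen(root);

    /* Ignore trailing slashes on root, but keep a lone "/". */
    while (n > 1 && root[n - 1] == '/')
        n--;
    if (strncmp(root, target, n) != 0)
        return 0;
    if (n == 1 && root[0] == '/')
        return 1;                       /* everything is under "/" */
    return target[n] == '\0' || target[n] == '/';
}

/*
 * Copy into out[] the mount points that would need their own
 * remount call; returns how many were selected.
 */
size_t select_submounts(const char *root, const char *const *targets,
                        size_t n, const char **out)
{
    size_t k = 0;

    for (size_t i = 0; i < n; i++)
        if (mount_is_under(root, targets[i]))
            out[k++] = targets[i];
    return k;
}
```

The list of mount points would come from /proc/self/mountinfo; anything mounted under the target between reading the table and issuing the remounts would be missed, which is the kind of gap a kernel-side fix would close.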
Thread overview: 9+ messages (newest: 2013-12-13 10:45 UTC)

2013-12-11 14:37 read only bind mount ignores ready only Phillip Susi
2013-12-11 16:49 ` Karel Zak
2013-12-12 12:05 ` Karel Zak
2013-12-12 14:59   ` Phillip Susi
2013-12-12 16:02     ` Karel Zak
2013-12-12 19:42   ` Miklos Szeredi
2013-12-12 21:53     ` Al Viro
2013-12-13 10:45       ` Karel Zak
2013-12-13  8:18     ` Karel Zak