* Re: [patch 0/8] unprivileged mount syscall
       [not found] <20070404183012.429274832@szeredi.hu>
From: Andrew Morton @ 2007-04-06 23:02 UTC (2 replies; 36+ messages in thread)
To: Miklos Szeredi
Cc: linux-fsdevel, util-linux-ng, containers, linux-kernel

On Wed, 04 Apr 2007 20:30:12 +0200 Miklos Szeredi <miklos@szeredi.hu> wrote:

> This patchset adds support for keeping mount ownership information in
> the kernel, and allows unprivileged mount(2) and umount(2) in certain
> cases.

No replies, huh?

My knowledge of the code which you're touching is not strong, and my spare
reviewing capacity is not high. And this work does need close review by
people who are familiar with the code which you're changing.

So could I suggest that you go for a dig through the git history, identify
some individuals who look like they know this code, then do a resend,
cc'ing those people? Please also cc linux-kernel on that resend.

> This can be useful for the following reasons:
>
> - mount(8) can store ownership ("user=XY" option) in the kernel
>   instead of, or in addition to, storing it in /etc/mtab. For example if
>   private namespaces are used with mount propagations /etc/mtab
>   becomes unworkable, but using /proc/mounts works fine
>
> - fuse won't need a special suid-root mount/umount utility. Plain
>   umount(8) can easily be made to work with unprivileged fuse mounts
>
> - users can use bind mounts without having to pre-configure them in
>   /etc/fstab
>
> All this is done in a secure way, and unprivileged bind and fuse
> mounts are disabled by default and can be enabled through sysctl or
> /proc/sys.
>
> One thing that is missing from this series is the ability to restrict
> user mounts to private namespaces. The reason is that private
> namespaces have still not gained the momentum and support needed for
> painless user experience. So such a feature would not yet get enough
> attention and testing. However adding such an optional restriction
> can be done with minimal changes in the future, once private
> namespaces have matured.

I suspect the people who developed and maintain nsproxy would disagree ;)

Please also cc containers@lists.osdl.org.

> An earlier version of these patches has been discussed here:
>
>   http://lkml.org/lkml/2005/5/3/64

* Re: [patch 0/8] unprivileged mount syscall
From: H. Peter Anvin @ 2007-04-06 23:16 UTC
To: Andrew Morton
Cc: Miklos Szeredi, linux-fsdevel, util-linux-ng, containers, linux-kernel

>>
>> - users can use bind mounts without having to pre-configure them in
>>   /etc/fstab
>>

This is by far the biggest concern I see. I think the security
implications of allowing anyone to do bind mounts are poorly understood.

        -hpa

* Re: [patch 0/8] unprivileged mount syscall
From: Jan Engelhardt @ 2007-04-06 23:55 UTC
To: H. Peter Anvin
Cc: Andrew Morton, Miklos Szeredi, linux-fsdevel, util-linux-ng, containers, linux-kernel

On Apr 6 2007 16:16, H. Peter Anvin wrote:
>> >
>> > - users can use bind mounts without having to pre-configure them in
>> >   /etc/fstab
>> >
>
> This is by far the biggest concern I see. I think the security implications of
> allowing anyone to do bind mounts are poorly understood.

$ whoami
miklos
$ mount --bind / ~/down_under

later that day:
# userdel -r miklos

So both the source (/) and target (~/down_under) directories must be owned
by the user before --bind may succeed.

There may be other implications hpa might want to fill us in on.

Regards,
Jan
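
The `userdel -r` scenario above is dangerous partly because a bind mount of a directory on the same filesystem has the same device number as its surroundings, so a recursive walk that only compares `st_dev` will not notice it. One way a cleanup or backup tool could protect itself is to consult the mount table first. The following is a minimal sketch, not anything proposed in this thread: it assumes the table is available in `/proc/mounts` format, the function names `unescape` and `mounts_under` are hypothetical, and the sample table is invented for illustration.

```python
import re


def unescape(field):
    # /proc/mounts writes characters such as space as octal escapes, e.g. \040.
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def mounts_under(mounts_text, directory):
    """Return every mount point that lies strictly below `directory`."""
    base = directory.rstrip("/") or "/"
    prefix = base if base == "/" else base + "/"
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a well-formed mount table entry
        mount_point = unescape(fields[1])
        if mount_point != base and mount_point.startswith(prefix):
            found.append(mount_point)
    return found


# Invented sample: "/" bind-mounted inside a home directory, as in the example.
SAMPLE = """\
/dev/sda1 / ext3 rw 0 0
proc /proc proc rw 0 0
/dev/sda1 /home/miklos/down_under ext3 rw 0 0
/dev/sda1 /home/miklos/my\\040docs ext3 rw 0 0
"""

print(mounts_under(SAMPLE, "/home/miklos"))
```

A tool in the position of `userdel -r` could refuse to recurse, or unmount first, whenever this list is non-empty. On a live system the text would come from reading `/proc/mounts` in the namespace of the process doing the walk.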

* Re: [patch 0/8] unprivileged mount syscall
From: H. Peter Anvin @ 2007-04-07 0:22 UTC
To: Jan Engelhardt
Cc: Andrew Morton, Miklos Szeredi, linux-fsdevel, util-linux-ng, containers, linux-kernel

Jan Engelhardt wrote:
> On Apr 6 2007 16:16, H. Peter Anvin wrote:
>>>> - users can use bind mounts without having to pre-configure them in
>>>>   /etc/fstab
>>>>
>> This is by far the biggest concern I see. I think the security implications of
>> allowing anyone to do bind mounts are poorly understood.
>
> $ whoami
> miklos
> $ mount --bind / ~/down_under
>
> later that day:
> # userdel -r miklos
>
> So both the source (/) and target (~/down_under) directories must be owned
> by the user before --bind may succeed.
>
> There may be other implications hpa might want to fill us in on.

Consider backups, for example.

        -hpa

* Re: [patch 0/8] unprivileged mount syscall
From: Eric Van Hensbergen @ 2007-04-07 3:40 UTC
To: H. Peter Anvin
Cc: Jan Engelhardt, Andrew Morton, Miklos Szeredi, linux-fsdevel, util-linux-ng, containers, linux-kernel

On 4/6/07, H. Peter Anvin <hpa@zytor.com> wrote:
> Jan Engelhardt wrote:
> > On Apr 6 2007 16:16, H. Peter Anvin wrote:
> >>>> - users can use bind mounts without having to pre-configure them in
> >>>>   /etc/fstab
> >>>>
> >> This is by far the biggest concern I see. I think the security implications of
> >> allowing anyone to do bind mounts are poorly understood.
> >
> > $ whoami
> > miklos
> > $ mount --bind / ~/down_under
> >
> > later that day:
> > # userdel -r miklos
> >
>
> Consider backups, for example.
>

This is the reason why enforcing private namespaces for user mounts
makes sense. I think it catches many of these corner cases.

        -eric

* Re: [patch 0/8] unprivileged mount syscall
From: Miklos Szeredi @ 2007-04-07 6:48 UTC
To: ericvh
Cc: hpa, jengelh, akpm, linux-fsdevel, util-linux-ng, containers, linux-kernel

> On 4/6/07, H. Peter Anvin <hpa@zytor.com> wrote:
> > Jan Engelhardt wrote:
> > > On Apr 6 2007 16:16, H. Peter Anvin wrote:
> > >>>> - users can use bind mounts without having to pre-configure them in
> > >>>>   /etc/fstab
> > >>>>
> > >> This is by far the biggest concern I see. I think the security implications of
> > >> allowing anyone to do bind mounts are poorly understood.
> > >
> > > $ whoami
> > > miklos
> > > $ mount --bind / ~/down_under
> > >
> > > later that day:
> > > # userdel -r miklos
> > >
> >
> > Consider backups, for example.
> >
>
> This is the reason why enforcing private namespaces for user mounts
> makes sense. I think it catches many of these corner cases.

Yes, disabling user bind mounts in the global namespace makes sense.

Enabling user fuse mounts in the global namespace still works though,
even if a little kludgy. All these nasty corner cases have been
thought through and validated by a lot of users.

Thanks,
Miklos

* Re: [patch 0/8] unprivileged mount syscall
From: Ian Kent @ 2007-04-10 8:52 UTC
To: H. Peter Anvin
Cc: Andrew Morton, Miklos Szeredi, linux-fsdevel, util-linux-ng, containers, linux-kernel

On Fri, 2007-04-06 at 16:16 -0700, H. Peter Anvin wrote:
> >>
> >> - users can use bind mounts without having to pre-configure them in
> >>   /etc/fstab
> >>
>
> This is by far the biggest concern I see. I think the security
> implications of allowing anyone to do bind mounts are poorly understood.

And especially so since there is no way for a filesystem module to veto
such requests.

Ian

* Re: [patch 0/8] unprivileged mount syscall
From: Miklos Szeredi @ 2007-04-11 10:48 UTC
To: raven
Cc: hpa, akpm, linux-fsdevel, util-linux-ng, containers, linux-kernel

> > >>
> > >> - users can use bind mounts without having to pre-configure them in
> > >>   /etc/fstab
> > >>
> >
> > This is by far the biggest concern I see. I think the security
> > implications of allowing anyone to do bind mounts are poorly understood.
>
> And especially so since there is no way for a filesystem module to veto
> such requests.

The filesystem can't veto initial mounts based on destination either.
I don't think it's up to the filesystem to police bind/move mounts in
any way.

Miklos

* Re: [patch 0/8] unprivileged mount syscall
From: Ian Kent @ 2007-04-11 13:48 UTC
To: Miklos Szeredi
Cc: hpa, akpm, linux-fsdevel, util-linux-ng, containers, linux-kernel

On Wed, 2007-04-11 at 12:48 +0200, Miklos Szeredi wrote:
> > > >>
> > > >> - users can use bind mounts without having to pre-configure them in
> > > >>   /etc/fstab
> > > >>
> > >
> > > This is by far the biggest concern I see. I think the security
> > > implications of allowing anyone to do bind mounts are poorly understood.
> >
> > And especially so since there is no way for a filesystem module to veto
> > such requests.
>
> The filesystem can't veto initial mounts based on destination either.
> I don't think it's up to the filesystem to police bind/move mounts in
> any way.

But if a filesystem can't, or the developer thinks that it shouldn't for
some reason, support bind/move mounts, then there should be a way for the
filesystem to tell the kernel that.

Surely a filesystem is in a good position to be able to decide if a
mount request "for it" should be allowed to continue based on its "own
situation and capabilities".

Ian

* Re: [patch 0/8] unprivileged mount syscall
From: Serge E. Hallyn @ 2007-04-11 14:26 UTC
To: Ian Kent
Cc: Miklos Szeredi, hpa, akpm, linux-fsdevel, util-linux-ng, containers, linux-kernel

Quoting Ian Kent (raven@themaw.net):
> On Wed, 2007-04-11 at 12:48 +0200, Miklos Szeredi wrote:
> > > > >>
> > > > >> - users can use bind mounts without having to pre-configure them in
> > > > >>   /etc/fstab
> > > > >>
> > > >
> > > > This is by far the biggest concern I see. I think the security
> > > > implications of allowing anyone to do bind mounts are poorly understood.
> > >
> > > And especially so since there is no way for a filesystem module to veto
> > > such requests.
> >
> > The filesystem can't veto initial mounts based on destination either.
> > I don't think it's up to the filesystem to police bind/move mounts in
> > any way.
>
> But if a filesystem can't, or the developer thinks that it shouldn't for
> some reason, support bind/move mounts, then there should be a way for the

Can you list some valid reasons why an fs could care where it is
mounted? The only thing I could think of is a stackable fs, but it
shouldn't care whether it is overlay-mounted or not.

thanks,
-serge

> filesystem to tell the kernel that.
>
> Surely a filesystem is in a good position to be able to decide if a
> mount request "for it" should be allowed to continue based on its "own
> situation and capabilities".
>
> Ian

* Re: [patch 0/8] unprivileged mount syscall
From: Ian Kent @ 2007-04-11 14:27 UTC
To: Serge E. Hallyn
Cc: Miklos Szeredi, hpa, akpm, linux-fsdevel, util-linux-ng, containers, linux-kernel

On Wed, 2007-04-11 at 09:26 -0500, Serge E. Hallyn wrote:
> Quoting Ian Kent (raven@themaw.net):
> > On Wed, 2007-04-11 at 12:48 +0200, Miklos Szeredi wrote:
> > > > > >>
> > > > > >> - users can use bind mounts without having to pre-configure them in
> > > > > >>   /etc/fstab
> > > > > >>
> > > > >
> > > > > This is by far the biggest concern I see. I think the security
> > > > > implications of allowing anyone to do bind mounts are poorly understood.
> > > >
> > > > And especially so since there is no way for a filesystem module to veto
> > > > such requests.
> > >
> > > The filesystem can't veto initial mounts based on destination either.
> > > I don't think it's up to the filesystem to police bind/move mounts in
> > > any way.
> >
> > But if a filesystem can't, or the developer thinks that it shouldn't for
> > some reason, support bind/move mounts, then there should be a way for the
>
> Can you list some valid reasons why an fs could care where it is
> mounted? The only thing I could think of is a stackable fs, but it
> shouldn't care whether it is overlay-mounted or not.

For my part, autofs and autofs4.

Moving or binding isn't valid.
I tried to design that limitation out of version 5 but wasn't able to.
In time I probably can but couldn't continue to support older versions.

>
> thanks,
> -serge
>
> > filesystem to tell the kernel that.
> >
> > Surely a filesystem is in a good position to be able to decide if a
> > mount request "for it" should be allowed to continue based on its "own
> > situation and capabilities".
> >
> > Ian

* Re: [patch 0/8] unprivileged mount syscall
From: Serge E. Hallyn @ 2007-04-11 14:45 UTC
To: Ian Kent
Cc: Miklos Szeredi, hpa, akpm, linux-fsdevel, util-linux-ng, containers, linux-kernel

Quoting Ian Kent (raven@themaw.net):
> On Wed, 2007-04-11 at 09:26 -0500, Serge E. Hallyn wrote:
> > Quoting Ian Kent (raven@themaw.net):
> > > On Wed, 2007-04-11 at 12:48 +0200, Miklos Szeredi wrote:
> > > > > > >>
> > > > > > >> - users can use bind mounts without having to pre-configure them in
> > > > > > >>   /etc/fstab
> > > > > > >>
> > > > > >
> > > > > > This is by far the biggest concern I see. I think the security
> > > > > > implications of allowing anyone to do bind mounts are poorly understood.
> > > > >
> > > > > And especially so since there is no way for a filesystem module to veto
> > > > > such requests.
> > > >
> > > > The filesystem can't veto initial mounts based on destination either.
> > > > I don't think it's up to the filesystem to police bind/move mounts in
> > > > any way.
> > >
> > > But if a filesystem can't, or the developer thinks that it shouldn't for
> > > some reason, support bind/move mounts, then there should be a way for the
> >
> > Can you list some valid reasons why an fs could care where it is
> > mounted? The only thing I could think of is a stackable fs, but it
> > shouldn't care whether it is overlay-mounted or not.
>
> For my part, autofs and autofs4.

Ah, thanks. I can see I'm going to have to start using autofs to get to
know the implementation, because it seems clear we'll run into it in the
containers work again (beyond the struct pid conv) at some point.

> Moving or binding isn't valid.
> I tried to design that limitation out of version 5 but wasn't able to.
> In time I probably can but couldn't continue to support older versions.

thanks,
-serge

> >
> > thanks,
> > -serge
> >
> > > filesystem to tell the kernel that.
> > >
> > > Surely a filesystem is in a good position to be able to decide if a
> > > mount request "for it" should be allowed to continue based on its "own
> > > situation and capabilities".
> > >
> > > Ian

* Re: [patch 0/8] unprivileged mount syscall
From: Miklos Szeredi @ 2007-04-07 6:41 UTC
To: akpm
Cc: linux-fsdevel, util-linux-ng, containers, linux-kernel

> > This patchset adds support for keeping mount ownership information in
> > the kernel, and allows unprivileged mount(2) and umount(2) in certain
> > cases.
>
> No replies, huh?

All we need is a comment from Andrew, and the replies come flooding in ;)

> My knowledge of the code which you're touching is not strong, and my spare
> reviewing capacity is not high. And this work does need close review by
> people who are familiar with the code which you're changing.
>
> So could I suggest that you go for a dig through the git history, identify
> some individuals who look like they know this code, then do a resend,
> cc'ing those people? Please also cc linux-kernel on that resend.

OK.

> > One thing that is missing from this series is the ability to restrict
> > user mounts to private namespaces. The reason is that private
> > namespaces have still not gained the momentum and support needed for
> > painless user experience. So such a feature would not yet get enough
> > attention and testing. However adding such an optional restriction
> > can be done with minimal changes in the future, once private
> > namespaces have matured.
>
> I suspect the people who developed and maintain nsproxy would disagree ;)

Well, they better show me some working and simple-to-use userspace
code, because I've not seen anything like that related to mount
namespaces.

pam_namespace.so is one example of a non-working, but
probably-not-too-hard-to-fix one.

I'm just saying this is not yet something that Joe Blow would just
enable by ticking a box in their desktop setup wizard, and it would
all work flawlessly thereafter. There's still a _long_ way towards
that, and mostly in userspace.

Thanks,
Miklos

* Re: [patch 0/8] unprivileged mount syscall
From: Serge E. Hallyn @ 2007-04-09 14:38 UTC
To: Miklos Szeredi
Cc: akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel

Quoting Miklos Szeredi (miklos@szeredi.hu):
> > > This patchset adds support for keeping mount ownership information in
> > > the kernel, and allows unprivileged mount(2) and umount(2) in certain
> > > cases.
> >
> > No replies, huh?
>
> All we need is a comment from Andrew, and the replies come flooding in ;)
>
> > My knowledge of the code which you're touching is not strong, and my spare
> > reviewing capacity is not high. And this work does need close review by
> > people who are familiar with the code which you're changing.
> >
> > So could I suggest that you go for a dig through the git history, identify
> > some individuals who look like they know this code, then do a resend,
> > cc'ing those people? Please also cc linux-kernel on that resend.
>
> OK.
>
> > > One thing that is missing from this series is the ability to restrict
> > > user mounts to private namespaces. The reason is that private
> > > namespaces have still not gained the momentum and support needed for
> > > painless user experience. So such a feature would not yet get enough
> > > attention and testing. However adding such an optional restriction
> > > can be done with minimal changes in the future, once private
> > > namespaces have matured.
> >
> > I suspect the people who developed and maintain nsproxy would disagree ;)
>
> Well, they better show me some working and simple-to-use userspace
> code, because I've not seen anything like that related to mount
> namespaces.

If you mean to test/exploit them, see
http://lxc.sourceforge.net/patches/2.6.20/2.6.20-lxc8/broken-out/tests/

Compile the ns_exec.c program and do

        ns_exec -m /bin/sh

to get a shell in a new mounts namespace.

> pam_namespace.so is one example of a non-working, but
> probably-not-too-hard-to-fix one.

Non-working? I sure hope the one used for LSPP certification is
working... As is the ugly version I wrote 18 months ago and use on my
laptop.

> I'm just saying this is not yet something that Joe Blow would just
> enable by ticking a box in their desktop setup wizard, and it would
> all work flawlessly thereafter. There's still a _long_ way towards
> that, and mostly in userspace.

I'm not sure there's that long a way to go, but clearly we need to be
showing users what they can do, or they'll never work their way towards
there.

For instance, as you say, a user admin gui with a checkmark and text
boxes saying 'enter new namespace on login', 'create private /tmp',
and 'create private dmcrypted /home' would be trivial right now.

-serge

* Re: [patch 0/8] unprivileged mount syscall
From: Miklos Szeredi @ 2007-04-09 16:24 UTC
To: serue
Cc: akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel

> > > > One thing that is missing from this series is the ability to restrict
> > > > user mounts to private namespaces. The reason is that private
> > > > namespaces have still not gained the momentum and support needed for
> > > > painless user experience. So such a feature would not yet get enough
> > > > attention and testing. However adding such an optional restriction
> > > > can be done with minimal changes in the future, once private
> > > > namespaces have matured.
> > >
> > > I suspect the people who developed and maintain nsproxy would disagree ;)
> >
> > Well, they better show me some working and simple-to-use userspace
> > code, because I've not seen anything like that related to mount
> > namespaces.
>
> If you mean to test/exploit them, see
> http://lxc.sourceforge.net/patches/2.6.20/2.6.20-lxc8/broken-out/tests/
>
> Compile the ns_exec.c program and do
>
>         ns_exec -m /bin/sh
>
> to get a shell in a new mounts namespace.

Cool, thanks. This is a very nice utility for testing, but for the
end user rather useless:

- user starts up a private namespace in a shell, mounts something

- then opens app from menu, tries to access mount, but the mount is
  not there

- user unhappy

BTW, looking at -mm unshare() on namespace is not privileged any more.
Why is that? Or rather, what's the reason, that clone() is privileged
and unshare() is not?

> > pam_namespace.so is one example of a non-working, but
> > probably-not-too-hard-to-fix one.
>
> Non-working? I sure hope the one used for LSPP certification is
> working... As is the ugly version I wrote 18 months ago and use on my
> laptop.

The one in pam-0.99.6.3-29.1 in opensuse-10.2 is totally broken. Are
you interested in the details? I can reproduce it, but forgot to note
down the details of the brokenness.

> > I'm just saying this is not yet something that Joe Blow would just
> > enable by ticking a box in their desktop setup wizard, and it would
> > all work flawlessly thereafter. There's still a _long_ way towards
> > that, and mostly in userspace.
>
> I'm not sure there's that long a way to go, but clearly we need to be
> showing users what they can do, or they'll never work their way towards
> there.

There _is_ a long way to go. Random things that spring to my mind:

- using /etc/mtab is broken with private namespaces, using
  /proc/mounts is missing various functionality, that /etc/mtab has,
  for example the "user" option, which this patchset adds

- need to set up mount propagation from global namespace to private
  ones, mount(8) does not yet have options to configure propagation

- user namespace setup: what if user has multiple sessions?

  1) namespaces are shared? That's tricky because the session needs to
     be a child of a namespace server, not of login. I'm not sure PAM
     can handle this

  2) or mounts are copied on login? That's not possible currently,
     as there's no way to send a mount between namespaces. Also it's
     tricky to make sure that new mounts are also shared

> For instance, as you say, a user admin gui with a checkmark and text
> boxes saying 'enter new namespace on login', 'create private /tmp',
> and 'create private dmcrypted /home' would be trivial right now.

Trivial modulo the above slightly non-trivial exemptions ;)

Miklos
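
The first item in the list above turns on whether mount ownership can be read back from `/proc/mounts` rather than `/etc/mtab`. As a rough illustration only, the sketch below assumes that the owner shows up as a `user=<value>` entry in the options field of each `/proc/mounts` line, which is how the cover letter describes the option ("user=XY"); the exact value format is an assumption, the helper names `mount_owner` and `owned_mounts` are hypothetical, and the sample table is invented.

```python
def mount_owner(options):
    """Return the value of a user=... entry in a comma-separated option
    string, or None when the mount carries no owner."""
    for opt in options.split(","):
        key, sep, value = opt.partition("=")
        if key == "user" and sep:
            return value
    return None


def owned_mounts(mounts_text):
    """Map mount point -> owner for every entry that has a user= option."""
    owners = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed lines
        owner = mount_owner(fields[3])
        if owner is not None:
            owners[fields[1]] = owner
    return owners


# Invented sample in /proc/mounts layout: device, mount point, type, options, dump, pass.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext3 rw 0 0
sshfs#host: /home/miklos/remote fuse rw,nosuid,nodev,user=miklos 0 0
/dev/sda1 /home/miklos/bind ext3 rw,nosuid,nodev,user=miklos 0 0
"""

print(owned_mounts(SAMPLE_MOUNTS))
```

A umount(8) that wanted to check "may this user unmount this?" without `/etc/mtab` could make the same lookup against the table of its own namespace, which is the property `/etc/mtab` cannot offer once namespaces are private.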

* Re: [patch 0/8] unprivileged mount syscall
From: Serge E. Hallyn @ 2007-04-09 17:07 UTC
To: Miklos Szeredi
Cc: serue, akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel, Ram Pai

Quoting Miklos Szeredi (miklos@szeredi.hu):
> > > > > One thing that is missing from this series is the ability to restrict
> > > > > user mounts to private namespaces. The reason is that private
> > > > > namespaces have still not gained the momentum and support needed for
> > > > > painless user experience. So such a feature would not yet get enough
> > > > > attention and testing. However adding such an optional restriction
> > > > > can be done with minimal changes in the future, once private
> > > > > namespaces have matured.
> > > >
> > > > I suspect the people who developed and maintain nsproxy would disagree ;)
> > >
> > > Well, they better show me some working and simple-to-use userspace
> > > code, because I've not seen anything like that related to mount
> > > namespaces.
> >
> > If you mean to test/exploit them, see
> > http://lxc.sourceforge.net/patches/2.6.20/2.6.20-lxc8/broken-out/tests/
> >
> > Compile the ns_exec.c program and do
> >
> >         ns_exec -m /bin/sh
> >
> > to get a shell in a new mounts namespace.
>
> Cool, thanks. This is a very nice utility for testing, but for the
> end user rather useless:

Well that depends on which end-user. Those wanting to create a vserver
or checkpoint-restart job will want this, but clearly we have a long way
to go for that upstream anyway.

> - user starts up a private namespace in a shell, mounts something
>
> - then opens app from menu, tries to access mount, but the mount is
>   not there
>
> - user unhappy
>
> BTW, looking at -mm unshare() on namespace is not privileged any more.
> Why is that? Or rather, what's the reason, that clone() is privileged
> and unshare() is not?

The check is still there - see
kernel/nsproxy.c:unshare_nsproxy_namespaces().

> > > pam_namespace.so is one example of a non-working, but
> > > probably-not-too-hard-to-fix one.
> >
> > Non-working? I sure hope the one used for LSPP certification is
> > working... As is the ugly version I wrote 18 months ago and use on my
> > laptop.
>
> The one in pam-0.99.6.3-29.1 in opensuse-10.2 is totally broken. Are
> you interested in the details? I can reproduce it, but forgot to note
> down the details of the brokenness.

I don't know how far removed that is from the one being used by redhat,
but assuming it's the same, then redhat-lspp@redhat.com will be
very interested.

> > > I'm just saying this is not yet something that Joe Blow would just
> > > enable by ticking a box in their desktop setup wizard, and it would
> > > all work flawlessly thereafter. There's still a _long_ way towards
> > > that, and mostly in userspace.
> >
> > I'm not sure there's that long a way to go, but clearly we need to be
> > showing users what they can do, or they'll never work their way towards
> > there.
>
> There _is_ a long way to go. Random things that spring to my mind:
>
> - using /etc/mtab is broken with private namespaces, using
>   /proc/mounts is missing various functionality, that /etc/mtab has,
>   for example the "user" option, which this patchset adds

Agreed those need fixing.

> - need to set up mount propagation from global namespace to private
>   ones, mount(8) does not yet have options to configure propagation

Hmm, I guess I get lost using my own little systems, and just assumed
that shared subtree functionality was making its way up into mount(8).
Ram, have you been working on that?

> - user namespace setup: what if user has multiple sessions?
>
>   1) namespaces are shared? That's tricky because the session needs to
>      be a child of a namespace server, not of login. I'm not sure PAM
>      can handle this
>
>   2) or mounts are copied on login? That's not possible currently,
>      as there's no way to send a mount between namespaces. Also it's
>      tricky to make sure that new mounts are also shared

See toward the end of the 'shared subtrees' OLS paper from last year for
a suggestion on how to let users effectively 'log in to' an existing
private mounts ns.

> > For instance, as you say, a user admin gui with a checkmark and text
> > boxes saying 'enter new namespace on login', 'create private /tmp',
> > and 'create private dmcrypted /home' would be trivial right now.
>
> Trivial modulo the above slightly non-trivial exemptions ;)

Ok, so it can use some very non-trivial fine-tuning... But I've been
using the above - minus the trivial gui - for over a year without ever
worrying about any of these short-comings.

> Miklos

-serge

* Re: [patch 0/8] unprivileged mount syscall
From: Ram Pai @ 2007-04-09 17:46 UTC
To: Serge E. Hallyn
Cc: Miklos Szeredi, akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel

On Mon, 2007-04-09 at 12:07 -0500, Serge E. Hallyn wrote:
> Quoting Miklos Szeredi (miklos@szeredi.hu):

> > - need to set up mount propagation from global namespace to private
> >   ones, mount(8) does not yet have options to configure propagation
>
> Hmm, I guess I get lost using my own little systems, and just assumed
> that shared subtree functionality was making its way up into mount(8).
> Ram, have you been working on that?

It is in FC6. I don't know the status of upstream util-linux. I did
submit the patch many times to Adrian Bunk (the then util-linux
maintainer) and got no response. I have not pushed the patches to the
new maintainer (Karel Zak?) though.

RP

* Re: [patch 0/8] unprivileged mount syscall
From: H. Peter Anvin @ 2007-04-09 18:25 UTC
To: Ram Pai
Cc: Serge E. Hallyn, Miklos Szeredi, akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel

Ram Pai wrote:
>
> It is in FC6. I don't know the status of upstream util-linux. I did
> submit the patch many times to Adrian Bunk (the then util-linux
> maintainer) and got no response. I have not pushed the patches to the
> new maintainer (Karel Zak?) though.
>

Well, do that, then :)

Seriously. The whole point of util-linux-ng is to make forward progress.

        -hpa
* Re: [patch 0/8] unprivileged mount syscall 2007-04-09 17:46 ` Ram Pai 2007-04-09 18:25 ` H. Peter Anvin @ 2007-04-10 10:33 ` Karel Zak 1 sibling, 0 replies; 36+ messages in thread From: Karel Zak @ 2007-04-10 10:33 UTC (permalink / raw) To: Ram Pai Cc: Serge E. Hallyn, Miklos Szeredi, akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel On Mon, Apr 09, 2007 at 10:46:25AM -0700, Ram Pai wrote: > On Mon, 2007-04-09 at 12:07 -0500, Serge E. Hallyn wrote: > > Quoting Miklos Szeredi (miklos@szeredi.hu): > > > > - need to set up mount propagation from global namespace to private > > > ones, mount(8) does not yet have options to configure propagation > > > > Hmm, I guess I get lost using my own little systems, and just assumed > > that shared subtree functionality was making its way up into mount(8). > > Ram, have you been working on that? > > It is in FC6. I don't know the status of upstream util-linux. I did > submit the patch many times to Adrian Bunk (the then util-linux > maintainer) and got no response. I have not pushed the patches to the > new maintainer (Karel Zak?) though. The "shared-subtree" patch has been applied: http://git.kernel.org/?p=utils/util-linux-ng/util-linux-ng.git;a=commitdiff;h=389fbea536e4308d9475fa2a89e53e188ce8a0e3;hp=939a997de0c761d29fb7530976ca20da4898703a Karel -- Karel Zak <kzak@redhat.com> ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [patch 0/8] unprivileged mount syscall 2007-04-09 17:07 ` Serge E. Hallyn 2007-04-09 17:46 ` Ram Pai @ 2007-04-09 20:10 ` Miklos Szeredi 2007-04-10 8:38 ` Ram Pai 1 sibling, 1 reply; 36+ messages in thread From: Miklos Szeredi @ 2007-04-09 20:10 UTC (permalink / raw) To: serue Cc: akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel, linuxram > > The one in pam-0.99.6.3-29.1 in opensuse-10.2 is totally broken. Are > > you interested in the details? I can reproduce it, but forgot to note > > down the details of the brokenness. > > I don't know how far removed that is from the one being used by redhat, > but assuming it's the same, then redhat-lspp@redhat.com will be > very interested. OK. > > - user namespace setup: what if user has multiple sessions? > > > > 1) namespaces are shared? That's tricky because the session needs to > > be a child of a namespace server, not of login. I'm not sure PAM > > can handle this > > > > 2) or mounts are copied on login? That's not possible currently, > > as there's no way to send a mount between namespaces. Also it's > > tricky to make sure that new mounts are also shared > > See toward the end of the 'shared subtrees' OLS paper from last year for > a suggestion on how to let users effectively 'log in to' an existing > private mounts ns. This? 1. create a new namespace 2. bind /share/$USER to /share 3. for each pair ($who, $what) such that /share/$USER/$who/$what exists, look in /share/$who/allowed for "peer $what $USER" or "slave $what $USER". If the former is found, rbind /share/$who/$what on /share/$USER/$who/$what; if the latter is found, do the same and follow with marking subtree under /share/$USER/$who/$what as slave. 4. rbind /share/$USER to /share 5. mark subtree under /share as private. 6. umount -l /share Well, someone please explain using short words, because I don't understand at all. Thanks, Miklos ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [patch 0/8] unprivileged mount syscall 2007-04-09 20:10 ` Miklos Szeredi @ 2007-04-10 8:38 ` Ram Pai 2007-04-11 10:44 ` Miklos Szeredi 0 siblings, 1 reply; 36+ messages in thread From: Ram Pai @ 2007-04-10 8:38 UTC (permalink / raw) To: Miklos Szeredi Cc: serue, akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel On Mon, 2007-04-09 at 22:10 +0200, Miklos Szeredi wrote: > > > The one in pam-0.99.6.3-29.1 in opensuse-10.2 is totally broken. Are > > > you interested in the details? I can reproduce it, but forgot to note > > > down the details of the brokenness. > > > > I don't know how far removed that is from the one being used by redhat, > > but assuming it's the same, then redhat-lspp@redhat.com will be > > very interested. > > OK. > > > > - user namespace setup: what if user has multiple sessions? > > > > > > 1) namespaces are shared? That's tricky because the session needs to > > > be a child of a namespace server, not of login. I'm not sure PAM > > > can handle this > > > > > > 2) or mounts are copied on login? That's not possible currently, > > > as there's no way to send a mount between namespaces. Also it's > > > tricky to make sure that new mounts are also shared > > > > See toward the end of the 'shared subtrees' OLS paper from last year for > > a suggestion on how to let users effectively 'log in to' an existing > > private mounts ns. > > This? > > 1. create a new namespace > 2. bind /share/$USER to /share > 3. for each pair ($who, $what) such that > /share/$USER/$who/$what exists, look > in /share/$who/allowed for "peer $what > $USER" or "slave $what $USER". If the > former is found, rbind /share/$who/$what > on /share/$USER/$who/$what; if the > latter is found, do the same and > follow with marking subtree under > /share/$USER/$who/$what as slave. > 4. rbind /share/$USER to /share > 5. mark subtree under /share as private. > 6. umount -l /share > > Well, someone please explain using short words, because I don't > understand at all. 
I am trying to reconstruct Viro's thoughts. I think the steps outlined
above, though not accurate, are still insightful.

The idea is this: there is one master namespace which has, under /share, a
replica of the mount tree of the namespaces belonging to each user.

For example, if there are two users A and B, then in the master namespace
under /share you will find /share/A and /share/B, each reflecting the mount
tree for the namespaces belonging to user A and user B respectively.

Note: /share is a shared mount tree, which means it can propagate mount
events.

Every time the user logs on to the machine, a new namespace is created which
is a clone of the master namespace. In this new namespace, /share/$user is
made the root of the namespace. Also, if other users have made part of their
namespace available to this user, then those mounts are also brought into
this namespace. Finally, the entire tree under /share is unmounted.

Note that, though multiple namespaces can exist simultaneously for the same
user, the user is provided the illusion of a per-process namespace, since all
the namespaces look identical.

I am trying to rewrite the steps outlined above, which may or may not reflect
Viro's thoughts, but certainly reflect my reconstruction of Viro's thoughts.

1. clone the master namespace.

2. in the new namespace

        move the tree under /share/$me to /
        for each ($user, $what, $how) {
                move /share/$user/$what to /$what
                if ($how == slave) {
                        make the mount tree under /$what as slave
                }
        }

3. in the new namespace make the tree under
   /share as private and unmount /share

RP

>
> Thanks,
> Miklos

^ permalink raw reply [flat|nested] 36+ messages in thread
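To make the control flow of the three steps above easier to follow, here is a minimal dry-run sketch in shell. It only prints a plan of abstract operations and never touches the mount table. The function name `plan_login`, the `SHARE_ROOT` variable and the `user:what:how` grant format are illustrative assumptions, not something proposed in the thread.

```shell
#!/bin/sh
# Dry-run transcription of the login steps: prints a plan, performs no mounts.
# plan_login, SHARE_ROOT and the "user:what:how" grant format are hypothetical.

SHARE_ROOT=${SHARE_ROOT:-/share}

# plan_login ME [USER:WHAT:HOW ...]
plan_login() {
    me=$1
    shift
    # Step 1: the session starts in a clone of the master namespace.
    echo "clone-namespace"
    # Step 2: the user's own tree becomes the root of the new namespace ...
    echo "move $SHARE_ROOT/$me /"
    # ... and each subtree granted by another user is moved into place.
    for grant in "$@"; do
        user=${grant%%:*}      # text before the first colon
        rest=${grant#*:}       # text after the first colon
        what=${rest%%:*}
        how=${rest#*:}
        echo "move $SHARE_ROOT/$user/$what /$what"
        if [ "$how" = slave ]; then
            echo "make-rslave /$what"
        fi
    done
    # Step 3: cut /share off from propagation, then detach it.
    echo "make-rprivate $SHARE_ROOT"
    echo "umount-lazy $SHARE_ROOT"
}

plan_login alice bob:projects:slave carol:music:peer
```

The operation names map roughly onto mount(8) and umount(8): `move` to `mount --move`, `make-rslave` to `mount --make-rslave`, `make-rprivate` to `mount --make-rprivate`, and `umount-lazy` to `umount -l`. How "move to /" would be realised (presumably together with chroot or pivot_root) is not spelled out in the thread, so the sketch leaves it abstract.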
* Re: [patch 0/8] unprivileged mount syscall 2007-04-10 8:38 ` Ram Pai @ 2007-04-11 10:44 ` Miklos Szeredi 2007-04-11 18:28 ` Ram Pai 0 siblings, 1 reply; 36+ messages in thread From: Miklos Szeredi @ 2007-04-11 10:44 UTC (permalink / raw) To: linuxram Cc: serue, akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel > 1. clone the master namespace. > > 2. in the new namespace > > move the tree under /share/$me to / > for each ($user, $what, $how) { > move /share/$user/$what to /$what > if ($how == slave) { > make the mount tree under /$what as slave > } > } > > 3. in the new namespace make the tree under > /share as private and unmount /share Thanks. I get the basic idea now: the namespace itself need not be shared between the sessions, it is enough if "share" propagation is set up between the different namespaces of a user. I don't yet see either in your or Viro's description how the trees under /share/$USER are initialized. I guess they are recursively bound from /, and are made slaves. Miklos ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [patch 0/8] unprivileged mount syscall 2007-04-11 10:44 ` Miklos Szeredi @ 2007-04-11 18:28 ` Ram Pai 2007-04-13 11:58 ` Miklos Szeredi 0 siblings, 1 reply; 36+ messages in thread From: Ram Pai @ 2007-04-11 18:28 UTC (permalink / raw) To: Miklos Szeredi Cc: serue, akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel

On Wed, 2007-04-11 at 12:44 +0200, Miklos Szeredi wrote:
> > 1. clone the master namespace.
> >
> > 2. in the new namespace
> >
> >         move the tree under /share/$me to /
> >         for each ($user, $what, $how) {
> >                 move /share/$user/$what to /$what
> >                 if ($how == slave) {
> >                         make the mount tree under /$what as slave
> >                 }
> >         }
> >
> > 3. in the new namespace make the tree under
> >    /share as private and unmount /share
>
> Thanks. I get the basic idea now: the namespace itself need not be
> shared between the sessions, it is enough if "share" propagation is
> set up between the different namespaces of a user.
>
> I don't yet see either in your or Viro's description how the trees
> under /share/$USER are initialized. I guess they are recursively
> bound from /, and are made slaves.

yes. I suppose, when a userid is created one of the steps would be

        mount --rbind / /share/$USER
        mount --make-rslave /share/$USER
        mount --make-rshared /share/$USER

RP

> Miklos

^ permalink raw reply [flat|nested] 36+ messages in thread
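The three mount commands in the message above could be wrapped in a small helper run at account-creation time. This is a minimal sketch, assuming `/` is a shared mount so that the slave step has something to follow; the names `run`, `init_user_tree`, `SHARE_ROOT` and `RUN`, and the extra `mkdir -p` step, are illustrative additions. By default it only prints the commands; it would need root and `RUN=1` to actually execute them.

```shell
#!/bin/sh
# Sketch: set up the per-user mirror tree /share/$USER.
# run, init_user_tree, SHARE_ROOT and RUN are hypothetical names.

SHARE_ROOT=${SHARE_ROOT:-/share}

# run CMD...: print the command in dry-run mode, execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

init_user_tree() {
    user=$1
    target="$SHARE_ROOT/$user"
    # Not in the original message: make sure the mount point exists.
    run mkdir -p "$target"
    # Recursively bind the tree rooted at / onto /share/$USER.
    run mount --rbind / "$target"
    # The copy receives mount events from its source but sends none back.
    run mount --make-rslave "$target"
    # Make the copy shared as well, so later binds of it become its peers.
    run mount --make-rshared "$target"
}

init_user_tree alice
```

The order of the last two commands matters: marking the tree slave first and shared second leaves it both a slave of the original mounts and a peer group of its own, which is what lets per-session copies stay in sync with each other.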
* Re: [patch 0/8] unprivileged mount syscall 2007-04-11 18:28 ` Ram Pai @ 2007-04-13 11:58 ` Miklos Szeredi 2007-04-13 13:28 ` Serge E. Hallyn ` (2 more replies) 0 siblings, 3 replies; 36+ messages in thread From: Miklos Szeredi @ 2007-04-13 11:58 UTC (permalink / raw) To: linuxram Cc: serue, akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel > On Wed, 2007-04-11 at 12:44 +0200, Miklos Szeredi wrote: > > > 1. clone the master namespace. > > > > > > 2. in the new namespace > > > > > > move the tree under /share/$me to / > > > for each ($user, $what, $how) { > > > move /share/$user/$what to /$what > > > if ($how == slave) { > > > make the mount tree under /$what as slave > > > } > > > } > > > > > > 3. in the new namespace make the tree under > > > /share as private and unmount /share > > > > Thanks. I get the basic idea now: the namespace itself need not be > > shared between the sessions, it is enough if "share" propagation is > > set up between the different namespaces of a user. > > > > I don't yet see either in your or Viro's description how the trees > > under /share/$USER are initialized. I guess they are recursively > > bound from /, and are made slaves. > > yes. I suppose, when a userid is created one of the steps would be > > mount --rbind / /share/$USER > mount --make-rslave /share/$USER > mount --make-rshared /share/$USER Thinking a bit more about this, I'm quite sure most users wouldn't even want private namespaces. It would be enough to chroot /share/$USER and be done with it. Private namespaces are only good for keeping a bunch of mounts referenced by a group of processes. But my guess is, that the natural behavior for users is to see a persistent set of mounts. If for example they mount something on a remote machine, then log out from the ssh session and later log back in, they would want to see their previous mount still there. Miklos ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [patch 0/8] unprivileged mount syscall 2007-04-13 11:58 ` Miklos Szeredi @ 2007-04-13 13:28 ` Serge E. Hallyn 2007-04-13 14:05 ` Miklos Szeredi 2007-04-13 20:07 ` Karel Zak 2007-04-16 7:59 ` Ram Pai 2 siblings, 1 reply; 36+ messages in thread From: Serge E. Hallyn @ 2007-04-13 13:28 UTC (permalink / raw) To: Miklos Szeredi Cc: linuxram, serue, akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel Quoting Miklos Szeredi (miklos@szeredi.hu): > > On Wed, 2007-04-11 at 12:44 +0200, Miklos Szeredi wrote: > > > > 1. clone the master namespace. > > > > > > > > 2. in the new namespace > > > > > > > > move the tree under /share/$me to / > > > > for each ($user, $what, $how) { > > > > move /share/$user/$what to /$what > > > > if ($how == slave) { > > > > make the mount tree under /$what as slave > > > > } > > > > } > > > > > > > > 3. in the new namespace make the tree under > > > > /share as private and unmount /share > > > > > > Thanks. I get the basic idea now: the namespace itself need not be > > > shared between the sessions, it is enough if "share" propagation is > > > set up between the different namespaces of a user. > > > > > > I don't yet see either in your or Viro's description how the trees > > > under /share/$USER are initialized. I guess they are recursively > > > bound from /, and are made slaves. > > > > yes. I suppose, when a userid is created one of the steps would be > > > > mount --rbind / /share/$USER > > mount --make-rslave /share/$USER > > mount --make-rshared /share/$USER > > Thinking a bit more about this, I'm quite sure most users wouldn't > even want private namespaces. It would be enough to > > chroot /share/$USER > > and be done with it. > > Private namespaces are only good for keeping a bunch of mounts > referenced by a group of processes. But my guess is, that the natural > behavior for users is to see a persistent set of mounts. 
> > If for example they mount something on a remote machine, then log out > from the ssh session and later log back in, they would want to see > their previous mount still there. > > Miklos Agreed on desired behavior, but not on chroot sufficing. It actually sounds like you want exactly what was outlined in the OLS paper. Users still need to be in a different mounts namespace from the admin user so long as we consider the deluser and backup problems to be legitimate problems (well, so long as user mounts are allowed). So, when they log in, pam gives them a new namespace and chroots them into /share/$USER. Assuming I'm thinking clearly :) -serge ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [patch 0/8] unprivileged mount syscall 2007-04-13 13:28 ` Serge E. Hallyn @ 2007-04-13 14:05 ` Miklos Szeredi 2007-04-13 21:44 ` Serge E. Hallyn 2007-04-16 8:18 ` Ram Pai 0 siblings, 2 replies; 36+ messages in thread From: Miklos Szeredi @ 2007-04-13 14:05 UTC (permalink / raw) To: serue Cc: linuxram, serue, akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel > > Thinking a bit more about this, I'm quite sure most users wouldn't > > even want private namespaces. It would be enough to > > > > chroot /share/$USER > > > > and be done with it. > > > > Private namespaces are only good for keeping a bunch of mounts > > referenced by a group of processes. But my guess is, that the natural > > behavior for users is to see a persistent set of mounts. > > > > If for example they mount something on a remote machine, then log out > > from the ssh session and later log back in, they would want to see > > their previous mount still there. > > > > Miklos > > Agreed on desired behavior, but not on chroot sufficing. It actually > sounds like you want exactly what was outlined in the OLS paper. > > Users still need to be in a different mounts namespace from the admin > user so long as we consider the deluser and backup problems I don't think it matters, because /share/$USER duplicates a part or the whole of the user's namespace. So backup would have to be taught about /share anyway, and deluser operates on /home/$USER and not on /share/*, so there shouldn't be any problem. There's actually very little difference between rbind+chroot, and CLONE_NEWNS. In a private namespace: 1) when no more processes reference the namespace, the tree will be disbanded 2) the mount tree won't be accessible from outside the namespace Wanting a persistent namespace contradicts 1). Wanting a per-user (as opposed to per-session) namespace contradicts 2). The namespace _has_ to be accessible from outside, so that a new session can access/copy it. 
So both requirements point to the rbind/chroot solution. Miklos ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [patch 0/8] unprivileged mount syscall 2007-04-13 14:05 ` Miklos Szeredi @ 2007-04-13 21:44 ` Serge E. Hallyn 2007-04-15 20:39 ` Miklos Szeredi 2007-04-16 8:18 ` Ram Pai 1 sibling, 1 reply; 36+ messages in thread From: Serge E. Hallyn @ 2007-04-13 21:44 UTC (permalink / raw) To: Miklos Szeredi Cc: serue, linuxram, akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel Quoting Miklos Szeredi (miklos@szeredi.hu): > > > Thinking a bit more about this, I'm quite sure most users wouldn't > > > even want private namespaces. It would be enough to > > > > > > chroot /share/$USER > > > > > > and be done with it. > > > > > > Private namespaces are only good for keeping a bunch of mounts > > > referenced by a group of processes. But my guess is, that the natural > > > behavior for users is to see a persistent set of mounts. > > > > > > If for example they mount something on a remote machine, then log out > > > from the ssh session and later log back in, they would want to see > > > their previous mount still there. > > > > > > Miklos > > > > Agreed on desired behavior, but not on chroot sufficing. It actually > > sounds like you want exactly what was outlined in the OLS paper. > > > > Users still need to be in a different mounts namespace from the admin > > user so long as we consider the deluser and backup problems > > I don't think it matters, because /share/$USER duplicates a part or > the whole of the user's namespace. > > So backup would have to be taught about /share anyway, and deluser > operates on /home/$USER and not on /share/*, so there shouldn't be any > problem. In what I was thinking of, /share/$USER is bind mounted to ~$USER/share, so it would have to be done in a private namespace in order for deluser to not be tricked. > There's actually very little difference between rbind+chroot, and > CLONE_NEWNS. 
In a private namespace: > > 1) when no more processes reference the namespace, the tree will be > disbanded > > 2) the mount tree won't be accessible from outside the namespace But it *can* be, if properly set up. That's part of the point of the example in the OLS paper. When a user logs in, sshd clones a new namespace, then bind-mounts /share/$USER into ~$USER/share. So assuming that /share/$USER was --make-shared'd, it and ~$USER are now in the same peer group, and any changes made by the user under ~$USER will be reflected back into /share/$USER. > Wanting a persistent namespace contradicts 1). Not necessarily, see above. > Wanting a per-user (as opposed to per-session) namespace contradicts > 2). The namespace _has_ to be accessible from outside, so that a new > session can access/copy it. Again, I *think* you are wrong that private namespace contradicts this requirement. > So both requirements point to the rbind/chroot solution. It all points to a combination of the two :-) -serge ^ permalink raw reply [flat|nested] 36+ messages in thread
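The per-session setup described in the message above (new namespace, then bind /share/$USER into ~$USER/share) might look roughly like the following with today's util-linux tools. This is only a sketch: `session_cmd` and `SHARE_ROOT` are illustrative names, the use of `su -` to start the user's shell is an assumption, and the function merely prints the command line that a login helper running as root could execute.

```shell
#!/bin/sh
# Sketch: print the command a login helper might run for one session.
# session_cmd and SHARE_ROOT are hypothetical names.

SHARE_ROOT=${SHARE_ROOT:-/share}

# session_cmd USER HOME
session_cmd() {
    user=$1
    home=$2
    # unshare(1) by default remounts everything private in the new
    # namespace; "--propagation unchanged" keeps the shared/slave flags
    # so that mounts made under ~/share still propagate to /share/$USER.
    printf "unshare --mount --propagation unchanged sh -c '%s'\n" \
        "mount --bind $SHARE_ROOT/$user $home/share && exec su - $user"
}

session_cmd alice /home/alice
```

Whether the bind mount ends up in the same peer group as /share/$USER depends on that tree having been marked shared beforehand, as the message assumes.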
* Re: [patch 0/8] unprivileged mount syscall 2007-04-13 21:44 ` Serge E. Hallyn @ 2007-04-15 20:39 ` Miklos Szeredi 2007-04-16 1:11 ` Serge E. Hallyn 0 siblings, 1 reply; 36+ messages in thread From: Miklos Szeredi @ 2007-04-15 20:39 UTC (permalink / raw) To: serue Cc: miklos, serue, linuxram, akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel > > > Agreed on desired behavior, but not on chroot sufficing. It actually > > > sounds like you want exactly what was outlined in the OLS paper. > > > > > > Users still need to be in a different mounts namespace from the admin > > > user so long as we consider the deluser and backup problems > > > > I don't think it matters, because /share/$USER duplicates a part or > > the whole of the user's namespace. > > > > So backup would have to be taught about /share anyway, and deluser > > operates on /home/$USER and not on /share/*, so there shouldn't be any > > problem. > > In what I was thinking of, /share/$USER is bind mounted to > ~$USER/share, so it would have to be done in a private namespace in > order for deluser to not be tricked. But /share/$USER is surely not bind mounted to ~$USER/share in the _global_ namespace, is it? I can't see any sense in that. > > There's actually very little difference between rbind+chroot, and > > CLONE_NEWNS. In a private namespace: > > > > 1) when no more processes reference the namespace, the tree will be > > disbanded > > > > 2) the mount tree won't be accessible from outside the namespace > > But it *can* be, if properly set up. That's part of the point of the > example in the OLS paper. When a user logs in, sshd clones a new > namespace, then bind-mounts /share/$USER into ~$USER/share. So assuming > that /share/$USER was --make-shared'd, it and ~$USER are now in the > same peer group, and any changes made by the user under ~$USER will > be reflected back into /share/$USER. I acknowledge, that it can be done. My point was that it can be done more simply _without_ using CLONE_NS. 
> > Wanting a persistent namespace contradicts 1). > > Not necessarily, see above. > > > Wanting a per-user (as opposed to per-session) namespace contradicts > > 2). The namespace _has_ to be accessible from outside, so that a new > > session can access/copy it. > > Again, I *think* you are wrong that private namespace contradicts this > requirement. I'm not saying there's any contradiction, I'm saying rbind+chroot is a better fit. I haven't yet heard a single reason why a per-session namespace with parts shared per-user is better than just a per-user namespace. Miklos ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [patch 0/8] unprivileged mount syscall 2007-04-15 20:39 ` Miklos Szeredi @ 2007-04-16 1:11 ` Serge E. Hallyn 0 siblings, 0 replies; 36+ messages in thread From: Serge E. Hallyn @ 2007-04-16 1:11 UTC (permalink / raw) To: Miklos Szeredi Cc: serue, linuxram, akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel Quoting Miklos Szeredi (miklos@szeredi.hu): > > > > Agreed on desired behavior, but not on chroot sufficing. It actually > > > > sounds like you want exactly what was outlined in the OLS paper. > > > > > > > > Users still need to be in a different mounts namespace from the admin > > > > user so long as we consider the deluser and backup problems > > > > > > I don't think it matters, because /share/$USER duplicates a part or > > > the whole of the user's namespace. > > > > > > So backup would have to be taught about /share anyway, and deluser > > > operates on /home/$USER and not on /share/*, so there shouldn't be any > > > problem. > > > > In what I was thinking of, /share/$USER is bind mounted to > > ~$USER/share, so it would have to be done in a private namespace in > > order for deluser to not be tricked. > > But /share/$USER is surely not bind mounted to ~$USER/share in the > _global_ namespace, is it? I can't see any sense in that. No it's not, only in the private namespace. > > > There's actually very little difference between rbind+chroot, and > > > CLONE_NEWNS. In a private namespace: > > > > > > 1) when no more processes reference the namespace, the tree will be > > > disbanded > > > > > > 2) the mount tree won't be accessible from outside the namespace > > > > But it *can* be, if properly set up. That's part of the point of the > > example in the OLS paper. When a user logs in, sshd clones a new > > namespace, then bind-mounts /share/$USER into ~$USER/share. 
So assuming > > that /share/$USER was --make-shared'd, it and ~$USER are now in the > > same peer group, and any changes made by the user under ~$USER will > > be reflected back into /share/$USER. > > I acknowledge, that it can be done. My point was that it can be done > more simply _without_ using CLONE_NS. Seems like a matter of preference, but I see what you're saying. > > > Wanting a persistent namespace contradicts 1). > > > > Not necessarily, see above. > > > > > Wanting a per-user (as opposed to per-session) namespace contradicts > > > 2). The namespace _has_ to be accessible from outside, so that a new > > > session can access/copy it. > > > > Again, I *think* you are wrong that private namespace contradicts this > > requirement. > > I'm not saying there's any contradiction, I'm saying rbind+chroot is a > better fit. Ok, I see. > I haven't yet heard a single reason why a per-session namespace with > parts shared per-user is better than just a per-user namespace. In fact I suspect we could show that they are functionally equivalent (for your purposes) by drawing the fs tree and peer groups from current->fs->root on up for both methods. And not using private namespaces leaves the admin (at least for now) better able to diagnose the state of the system. -serge ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [patch 0/8] unprivileged mount syscall 2007-04-13 14:05 ` Miklos Szeredi 2007-04-13 21:44 ` Serge E. Hallyn @ 2007-04-16 8:18 ` Ram Pai 2007-04-16 9:27 ` Miklos Szeredi 1 sibling, 1 reply; 36+ messages in thread From: Ram Pai @ 2007-04-16 8:18 UTC (permalink / raw) To: Miklos Szeredi Cc: serue, akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel On Fri, 2007-04-13 at 16:05 +0200, Miklos Szeredi wrote: > > > Thinking a bit more about this, I'm quite sure most users wouldn't > > > even want private namespaces. It would be enough to > > > > > > chroot /share/$USER > > > > > > and be done with it. > > > > > > Private namespaces are only good for keeping a bunch of mounts > > > referenced by a group of processes. But my guess is, that the natural > > > behavior for users is to see a persistent set of mounts. > > > > > > If for example they mount something on a remote machine, then log out > > > from the ssh session and later log back in, they would want to see > > > their previous mount still there. > > > > > > Miklos > > > > Agreed on desired behavior, but not on chroot sufficing. It actually > > sounds like you want exactly what was outlined in the OLS paper. > > > > Users still need to be in a different mounts namespace from the admin > > user so long as we consider the deluser and backup problems > > I don't think it matters, because /share/$USER duplicates a part or > the whole of the user's namespace. > > So backup would have to be taught about /share anyway, and deluser > operates on /home/$USER and not on /share/*, so there shouldn't be any > problem. > > There's actually very little difference between rbind+chroot, and > CLONE_NEWNS. In a private namespace: > > 1) when no more processes reference the namespace, the tree will be > disbanded > > 2) the mount tree won't be accessible from outside the namespace > > Wanting a persistent namespace contradicts 1). > > Wanting a per-user (as opposed to per-session) namespace contradicts > 2). 
The namespace _has_ to be accessible from outside, so that a new > session can access/copy it. As I mentioned in the previous mail, disbanding all the namespaces of a user will not disband his mount tree, because a mirror of the mount tree still continues to exist in /share/$USER in the admin namespace. And a new user session can always use this copy to create a namespace that looks identical to that which existed earlier. > > So both requirements point to the rbind/chroot solution. Aren't there ways to escape chroot jails? Serge had pointed me to a URL which showed chroots can be escaped. And if that is true then having all users' private mount trees in the same namespace can be a security issue? RP > > Miklos ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [patch 0/8] unprivileged mount syscall 2007-04-16 8:18 ` Ram Pai @ 2007-04-16 9:27 ` Miklos Szeredi 2007-04-16 15:40 ` Eric W. Biederman 0 siblings, 1 reply; 36+ messages in thread From: Miklos Szeredi @ 2007-04-16 9:27 UTC (permalink / raw) To: linuxram Cc: serue, akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel > Aren't there ways to escape chroot jails? Serge had pointed me to a URL > which showed chroots can be escaped. And if that is true then having all > users' private mount trees in the same namespace can be a security issue? No. In fact chrooting the user into /share/$USER will actually _grant_ a privilege to the user, instead of taking it away. It allows the user to modify its root namespace, which it wouldn't be able to in the initial namespace. So even if the user could escape from the chroot (which I doubt), s/he would not be able to do any harm, since unprivileged mounting would be restricted to /share. Also /share/$USER should only have read/search permission for $USER or no permissions at all, which would mean that other users' namespaces would be safe from tampering as well. Miklos ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [patch 0/8] unprivileged mount syscall 2007-04-16 9:27 ` Miklos Szeredi @ 2007-04-16 15:40 ` Eric W. Biederman 2007-04-16 15:55 ` Miklos Szeredi 0 siblings, 1 reply; 36+ messages in thread From: Eric W. Biederman @ 2007-04-16 15:40 UTC (permalink / raw) To: Miklos Szeredi Cc: linuxram, containers, linux-fsdevel, akpm, util-linux-ng, linux-kernel

Miklos Szeredi <miklos@szeredi.hu> writes:

>> Aren't there ways to escape chroot jails? Serge had pointed me to a URL
>> which showed chroots can be escaped. And if that is true then having all
>> users' private mount trees in the same namespace can be a security issue?
>
> No. In fact chrooting the user into /share/$USER will actually
> _grant_ a privilege to the user, instead of taking it away. It allows
> the user to modify its root namespace, which it wouldn't be able to
> in the initial namespace.
>
> So even if the user could escape from the chroot (which I doubt), s/he
> would not be able to do any harm, since unprivileged mounting would be
> restricted to /share. Also /share/$USER should only have read/search
> permission for $USER or no permissions at all, which would mean that
> other users' namespaces would be safe from tampering as well.

A couple of points.
- chroot can be escaped; it is just a chdir for the root directory,
  not a security feature. The only security is that you have to
  be root to call chroot. A carefully done namespace setup won't have
  that issue.

- While it may not violate security as far as what a user is allowed
  to modify, it may violate security as far as what a user is allowed
  to see.

There are interesting per-login cases as well, such as allowing a
user to replicate their mount tree from another machine when they
log in. When /home is on a network filesystem this can be very
practical and can allow propagation of mounts across machines, not
just across a single login session.

Eric

^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [patch 0/8] unprivileged mount syscall 2007-04-16 15:40 ` Eric W. Biederman @ 2007-04-16 15:55 ` Miklos Szeredi 0 siblings, 0 replies; 36+ messages in thread From: Miklos Szeredi @ 2007-04-16 15:55 UTC (permalink / raw) To: ebiederm Cc: linuxram, containers, linux-fsdevel, akpm, util-linux-ng, linux-kernel > >> Aren't there ways to escape chroot jails? Serge had pointed me to a URL > >> which showed chroots can be escaped. And if that is true then having all > >> users' private mount trees in the same namespace can be a security issue? > > > > No. In fact chrooting the user into /share/$USER will actually > > _grant_ a privilege to the user, instead of taking it away. It allows > > the user to modify its root namespace, which it wouldn't be able to > > in the initial namespace. > > > > So even if the user could escape from the chroot (which I doubt), s/he > > would not be able to do any harm, since unprivileged mounting would be > > restricted to /share. Also /share/$USER should only have read/search > > permission for $USER or no permissions at all, which would mean that > > other users' namespaces would be safe from tampering as well. > > A couple of points. > - chroot can be escaped; it is just a chdir for the root directory, > not a security feature. The only security is that you have to > be root to call chroot. A carefully done namespace setup won't have > that issue. > > - While it may not violate security as far as what a user is allowed > to modify, it may violate security as far as what a user is allowed > to see. I think that's just up to the permissions in the global namespace. In this example if you 'chmod 0 /share' there won't be anything for the user to see. > There are interesting per-login cases as well, such as allowing a > user to replicate their mount tree from another machine when they > log in.
When /home is on a network filesystem this can be very > practical and can allow propagation of mounts across machines not > just across a single login session. Yeah, sounds interesting, but I think it's better to get the basics working first, and then we can start to think about the extras. Btw, there's nothing that prevents cloning the namespace _after_ chrooting into the per-user tree. That would still be simpler than doing it the other way round: first creating per-session namespaces and then setting up mount propagation between them. Miklos ^ permalink raw reply [flat|nested] 36+ messages in thread
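The "chroot first, then clone the namespace" ordering mentioned above can be sketched as a single command line. This is only an illustration under assumptions not made in the thread: it relies on the unshare(1) utility from util-linux, it assumes /share/$USER is a recursive bind of / (so that unshare is reachable inside the chroot), and it passes `--propagation unchanged` on the assumption that the shared state of the per-user tree should be kept. The helper name `print_session_command` is hypothetical, and the function only prints the command rather than running it.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) a command that first chroots
# into the per-user tree and then clones the mount namespace.
# print_session_command is a hypothetical helper, not part of any tool.
print_session_command() {
    user=$1
    shell=${2:-/bin/sh}   # program to run in the new namespace (assumed default)
    # chroot runs first; unshare then creates a new mount namespace for
    # the shell while leaving mount propagation settings untouched.
    echo "chroot /share/$user unshare --mount --propagation unchanged $shell"
}

print_session_command alice
```

Running the printed command requires root, and a real login path would still have to drop privileges to the user afterwards; that part is left out here.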
* Re: [patch 0/8] unprivileged mount syscall
  2007-04-13 11:58 ` Miklos Szeredi
  2007-04-13 13:28   ` Serge E. Hallyn
@ 2007-04-13 20:07   ` Karel Zak
  2007-04-15 20:21     ` Miklos Szeredi
  2007-04-16  7:59   ` Ram Pai
  2 siblings, 1 reply; 36+ messages in thread
From: Karel Zak @ 2007-04-13 20:07 UTC
To: Miklos Szeredi
Cc: linuxram, serue, akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel

On Fri, Apr 13, 2007 at 01:58:59PM +0200, Miklos Szeredi wrote:
> > On Wed, 2007-04-11 at 12:44 +0200, Miklos Szeredi wrote:
> > > > 1. clone the master namespace.
> > > >
> > > > 2. in the new namespace
> > > >
> > > >        move the tree under /share/$me to /
> > > >        for each ($user, $what, $how) {
> > > >            move /share/$user/$what to /$what
> > > >            if ($how == slave) {
> > > >                make the mount tree under /$what as slave
> > > >            }
> > > >        }
> > > >
> > > > 3. in the new namespace make the tree under
> > > >        /share as private and unmount /share
> > >
> > > Thanks. I get the basic idea now: the namespace itself need not be
> > > shared between the sessions, it is enough if "share" propagation is
> > > set up between the different namespaces of a user.
> > >
> > > I don't yet see either in your or Viro's description how the trees
> > > under /share/$USER are initialized. I guess they are recursively
> > > bound from /, and are made slaves.
> >
> > yes. I suppose, when a userid is created one of the steps would be
> >
> > mount --rbind / /share/$USER
> > mount --make-rslave /share/$USER
> > mount --make-rshared /share/$USER
>
> Thinking a bit more about this, I'm quite sure most users wouldn't
> even want private namespaces. It would be enough to
>
>   chroot /share/$USER
>
> and be done with it.

 I don't think so. How do you want to implement non-shared /tmp
 directories? The chroot is overkill in this case.

 See:
 http://www.coker.com.au/selinux/talks/sage-2006/PolyInstantiatedDirectories.html
 http://danwalsh.livejournal.com/

> Private namespaces are only good for keeping a bunch of mounts
> referenced by a group of processes. But my guess is, that the natural
> behavior for users is to see a persistent set of mounts.
>
> If for example they mount something on a remote machine, then log out
> from the ssh session and later log back in, they would want to see
> their previous mount still there.

 They can mount to /mnt where the directory is shared ("mount
 --make-shared /mnt") and visible in all namespaces.

 I think /share/$USER is an extreme example. You can find more
 situations where private namespaces are a nice solution.

    Karel

-- 
 Karel Zak  <kzak@redhat.com>
* Re: [patch 0/8] unprivileged mount syscall
  2007-04-13 20:07 ` Karel Zak
@ 2007-04-15 20:21 ` Miklos Szeredi
  0 siblings, 0 replies; 36+ messages in thread
From: Miklos Szeredi @ 2007-04-15 20:21 UTC
To: kzak
Cc: linuxram, serue, akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel

> > Thinking a bit more about this, I'm quite sure most users wouldn't
> > even want private namespaces. It would be enough to
> >
> >   chroot /share/$USER
> >
> > and be done with it.
>
>  I don't think so. How to you want to implement non-shared /tmp
>  directories?

  mount --bind /.tmp/$USER /share/$USER/tmp

or whatever else this polyunsaturated thingy does within the cloned
namespace.

>  The chroot is overkill in this case.

What do you mean it's an overkill? clone(CLONE_NS) duplicates all the
mounts, just as mount --rbind does.

> > Private namespaces are only good for keeping a bunch of mounts
> > referenced by a group of processes. But my guess is, that the natural
> > behavior for users is to see a persistent set of mounts.
> >
> > If for example they mount something on a remote machine, then log out
> > from the ssh session and later log back in, they would want to see
> > their previous mount still there.
>
>  They can mount to /mnt where the directory is shared ("mount
>  --make-shared /mnt") and visible and all namespaces.
>
>  I think /share/$USER is an extreme example. You can found more
>  situations when private namespaces are nice solution.

Private to a single login session? I'd like to hear examples.

Thanks,
Miklos
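The per-user /tmp bind mount suggested in this message could be scripted roughly as follows. This is a minimal sketch, not something proposed in the thread: the `mkdir` and `chown` steps and the helper name `print_private_tmp_setup` are assumptions added for illustration, and only the final `mount --bind` line comes from the message itself. The function prints the commands instead of executing them, since running them needs root.

```shell
#!/bin/sh
# Dry-run sketch: print the commands that would give a user a
# non-shared /tmp inside their /share/$USER tree.
# print_private_tmp_setup is a hypothetical helper name.
print_private_tmp_setup() {
    user=$1
    # Assumed preparation: make sure both directories exist and the
    # backing directory belongs to the user.
    echo "mkdir -p /.tmp/$user /share/$user/tmp"
    echo "chown $user /.tmp/$user"
    # The bind mount quoted from the message above.
    echo "mount --bind /.tmp/$user /share/$user/tmp"
}

print_private_tmp_setup alice
```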
* Re: [patch 0/8] unprivileged mount syscall
  2007-04-13 11:58 ` Miklos Szeredi
  2007-04-13 13:28   ` Serge E. Hallyn
  2007-04-13 20:07   ` Karel Zak
@ 2007-04-16  7:59   ` Ram Pai
  2 siblings, 0 replies; 36+ messages in thread
From: Ram Pai @ 2007-04-16 7:59 UTC
To: Miklos Szeredi
Cc: serue, akpm, linux-fsdevel, containers, util-linux-ng, linux-kernel

On Fri, 2007-04-13 at 13:58 +0200, Miklos Szeredi wrote:
> > On Wed, 2007-04-11 at 12:44 +0200, Miklos Szeredi wrote:
> > > > 1. clone the master namespace.
> > > >
> > > > 2. in the new namespace
> > > >
> > > >        move the tree under /share/$me to /
> > > >        for each ($user, $what, $how) {
> > > >            move /share/$user/$what to /$what
> > > >            if ($how == slave) {
> > > >                make the mount tree under /$what as slave
> > > >            }
> > > >        }
> > > >
> > > > 3. in the new namespace make the tree under
> > > >        /share as private and unmount /share
> > >
> > > Thanks. I get the basic idea now: the namespace itself need not be
> > > shared between the sessions, it is enough if "share" propagation is
> > > set up between the different namespaces of a user.
> > >
> > > I don't yet see either in your or Viro's description how the trees
> > > under /share/$USER are initialized. I guess they are recursively
> > > bound from /, and are made slaves.
> >
> > yes. I suppose, when a userid is created one of the steps would be
> >
> > mount --rbind / /share/$USER
> > mount --make-rslave /share/$USER
> > mount --make-rshared /share/$USER
>
> Thinking a bit more about this, I'm quite sure most users wouldn't
> even want private namespaces. It would be enough to
>
>   chroot /share/$USER
>
> and be done with it.
>
> Private namespaces are only good for keeping a bunch of mounts
> referenced by a group of processes. But my guess is, that the natural
> behavior for users is to see a persistent set of mounts.
>
> If for example they mount something on a remote machine, then log out
> from the ssh session and later log back in, they would want to see
> their previous mount still there.

They will continue to see their previous mount tree. Even if all the
namespaces belonging to the different sessions of the user get
dismantled when all the sessions exit, a mirror of those mount trees
continues to exist under /share/$USER in the original namespace. So I
don't think we have an issue.

NOTE: when I say 'original namespace' I mean the admin namespace; the
first namespace that gets created when the machine boots.

RP

>
> Miklos
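The three per-user setup commands quoted in this message (recursive bind of /, then slave, then shared) can be wrapped in a small helper for illustration. This is a sketch only: the helper name `print_user_tree_setup` is hypothetical, and it prints the commands rather than executing them, because they require root and an existing /share/$USER directory.

```shell
#!/bin/sh
# Dry-run sketch: print the per-user tree setup steps quoted above,
# in the order given in the thread.
# print_user_tree_setup is a hypothetical helper name.
print_user_tree_setup() {
    user=$1
    # Mirror the whole tree under /share/$user.
    echo "mount --rbind / /share/$user"
    # Make the mirror receive mount events from / without sending any back.
    echo "mount --make-rslave /share/$user"
    # Let later clones of the mirror share mount events among themselves.
    echo "mount --make-rshared /share/$user"
}

print_user_tree_setup alice
```

Because the mirror lives in the original namespace, it persists after every session namespace of the user has gone away, which is the point made in the reply above.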
Thread overview: 36+ messages
[not found] <20070404183012.429274832@szeredi.hu>
2007-04-06 23:02 ` [patch 0/8] unprivileged mount syscall Andrew Morton
2007-04-06 23:16 ` H. Peter Anvin
2007-04-06 23:55 ` Jan Engelhardt
2007-04-07 0:22 ` H. Peter Anvin
2007-04-07 3:40 ` Eric Van Hensbergen
2007-04-07 6:48 ` Miklos Szeredi
2007-04-10 8:52 ` Ian Kent
2007-04-11 10:48 ` Miklos Szeredi
2007-04-11 13:48 ` Ian Kent
2007-04-11 14:26 ` Serge E. Hallyn
2007-04-11 14:27 ` Ian Kent
2007-04-11 14:45 ` Serge E. Hallyn
2007-04-07 6:41 ` Miklos Szeredi
2007-04-09 14:38 ` Serge E. Hallyn
2007-04-09 16:24 ` Miklos Szeredi
2007-04-09 17:07 ` Serge E. Hallyn
2007-04-09 17:46 ` Ram Pai
2007-04-09 18:25 ` H. Peter Anvin
2007-04-10 10:33 ` Karel Zak
2007-04-09 20:10 ` Miklos Szeredi
2007-04-10 8:38 ` Ram Pai
2007-04-11 10:44 ` Miklos Szeredi
2007-04-11 18:28 ` Ram Pai
2007-04-13 11:58 ` Miklos Szeredi
2007-04-13 13:28 ` Serge E. Hallyn
2007-04-13 14:05 ` Miklos Szeredi
2007-04-13 21:44 ` Serge E. Hallyn
2007-04-15 20:39 ` Miklos Szeredi
2007-04-16 1:11 ` Serge E. Hallyn
2007-04-16 8:18 ` Ram Pai
2007-04-16 9:27 ` Miklos Szeredi
2007-04-16 15:40 ` Eric W. Biederman
2007-04-16 15:55 ` Miklos Szeredi
2007-04-13 20:07 ` Karel Zak
2007-04-15 20:21 ` Miklos Szeredi
2007-04-16 7:59 ` Ram Pai