* [RFC] Introduce CAP_CHECKPOINT capability and filter map_files/ access
@ 2011-11-17 10:04 Cyrill Gorcunov
2011-11-17 15:41 ` Serge E. Hallyn
0 siblings, 1 reply; 6+ messages in thread
From: Cyrill Gorcunov @ 2011-11-17 10:04 UTC (permalink / raw)
To: Andrew Morton, Tejun Heo, Pavel Emelyanov, Vasiliy Kulikov,
Serge E. Hallyn
Cc: LKML
The goal of checkpoint/restore is to make this feature available not
only to admins but to regular users as well. Still, some operations
are privileged, such as accessing /proc/$pid/map_files.
So instead of requiring CAP_SYS_ADMIN from anyone who wants to
checkpoint/restore processes, it might (?) be worth introducing a far
less powerful CAP_CHECKPOINT capability.
CAP_CHECKPOINT should grant the following permissions:
- read/write access to /proc/$pid/map_files/
- (not yet merged) clone-with-specified-pid, which might be changed to a last_pid+clone setup
- (not yet published/stabilized) prctl calls to tune the vDSO and elements
of mm_struct such as mm->start_code, mm->end_code, mm->start_data, etc.
I would like to gather opinions on this approach in general.
_ANY_ comments are highly appreciated. Is it worth it or not (given that
the capability space is pretty limited)?
(the patch is on top of -mm)
*NOT-FOR-INCLUSION*
---
fs/proc/base.c | 6 ++++--
include/linux/capability.h | 7 ++++++-
2 files changed, 10 insertions(+), 3 deletions(-)
Index: linux-2.6.git/fs/proc/base.c
===================================================================
--- linux-2.6.git.orig/fs/proc/base.c
+++ linux-2.6.git/fs/proc/base.c
@@ -2386,7 +2386,8 @@ static struct dentry *proc_map_files_loo
 	struct mm_struct *mm;

 	result = ERR_PTR(-EACCES);
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable(CAP_SYS_ADMIN) &&
+	    !capable(CAP_CHECKPOINT))
 		goto out;

 	result = ERR_PTR(-ENOENT);
@@ -2442,7 +2443,8 @@ proc_map_files_readdir(struct file *filp
 	int ret;

 	ret = -EACCES;
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable(CAP_SYS_ADMIN) &&
+	    !capable(CAP_CHECKPOINT))
 		goto out;

 	ret = -ENOENT;
Index: linux-2.6.git/include/linux/capability.h
===================================================================
--- linux-2.6.git.orig/include/linux/capability.h
+++ linux-2.6.git/include/linux/capability.h
@@ -360,8 +360,13 @@ struct cpu_vfs_cap_data {
#define CAP_WAKE_ALARM 35
+/*
+ * Allow use of the privileged operations needed for
+ * the checkpoint/restore procedure.
+ */
+#define CAP_CHECKPOINT 36
-#define CAP_LAST_CAP CAP_WAKE_ALARM
+#define CAP_LAST_CAP CAP_CHECKPOINT
#define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
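The check introduced by the patch can be sketched in userspace as follows. This is only a model, not kernel code: `cap_set_t`, `model_capable()` and `model_map_files_access()` are hypothetical names invented for illustration, the capability set is assumed to be a plain 64-bit mask, and the value 36 for CAP_CHECKPOINT is taken from the RFC patch above.

```c
/* Userspace model of the access check in the RFC patch; not kernel code. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define CAP_SYS_ADMIN   21
#define CAP_WAKE_ALARM  35
#define CAP_CHECKPOINT  36              /* value proposed by the RFC patch */
#define CAP_LAST_CAP    CAP_CHECKPOINT
#define cap_valid(x)    ((x) >= 0 && (x) <= CAP_LAST_CAP)

/* Hypothetical stand-in for a task's effective capability set. */
typedef uint64_t cap_set_t;

/* Model of capable(): is bit 'cap' set in the effective set? */
bool model_capable(cap_set_t effective, int cap)
{
    assert(cap_valid(cap));
    return ((effective >> cap) & 1u) != 0;
}

/*
 * Model of the patched test in proc_map_files_lookup() and
 * proc_map_files_readdir(): access is refused only when the caller
 * has neither CAP_SYS_ADMIN nor CAP_CHECKPOINT.
 */
int model_map_files_access(cap_set_t effective)
{
    if (!model_capable(effective, CAP_SYS_ADMIN) &&
        !model_capable(effective, CAP_CHECKPOINT))
        return -EACCES;
    return 0;
}
```

Calling `model_map_files_access((cap_set_t)1 << CAP_CHECKPOINT)` returns 0 in this model, while an empty set yields -EACCES. Bumping CAP_LAST_CAP matters because cap_valid() would otherwise reject the new number.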
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] Introduce CAP_CHECKPOINT capability and filter map_files/ access
2011-11-17 10:04 [RFC] Introduce CAP_CHECKPOINT capability and filter map_files/ access Cyrill Gorcunov
@ 2011-11-17 15:41 ` Serge E. Hallyn
2011-11-17 16:24 ` Cyrill Gorcunov
2011-11-17 20:54 ` Andrew Morton
0 siblings, 2 replies; 6+ messages in thread
From: Serge E. Hallyn @ 2011-11-17 15:41 UTC (permalink / raw)
To: Cyrill Gorcunov
Cc: Andrew Morton, Tejun Heo, Pavel Emelyanov, Vasiliy Kulikov, LKML
Quoting Cyrill Gorcunov (gorcunov@gmail.com):
> The goal idea of checkpoint/restore is to provide this feature not
> for admins only but regular users as well. Still some operations
> are privileged -- such as accessing /proc/$pid/map_files.
>
> So instead of requiring anyone who has a will to checkpoint/restore
> processes CAP_SYS_ADMIN privileges, it might (?) be worth to bring a way
> less powerful CAP_CHECKPOINT capability.
>
> The following permissions for CAP_CHECKPOINT should be granted
> - read/write /proc/$pid/map_files/
read/write to all map files, or only pids he owns?
I think a CAP_CHECKPOINT may make sense, but not if it includes read/write
to all map files. That's too much power, and you may as well just hand
him everything. But, CAP_CHECKPOINT shouldn't need to include that. You
should be able to get that for instance by being the creator of the user
namespace being checkpointed. If you really want to checkpoint/restart
anything on the system, then you should be required to be root. Trying
to easily hand that power to an unprivileged user is more dangerous imo.
> - (not yet merged) clone-with-specified-pid, might be changed to last_pid+clone setup
> - (not yet published/stabilized) prctls calls to tune up vDSO and elements
> of mm_struct such as mm->start_code, mm->end_code, mm->start_data and etc
>
> I would like to gather people opinions on such approach as a general.
> _ANY_ comments are highly appreciated. Would it worth it or not (since
> CAPs space is pretty limited one).
It's hard to have a specific dialogue without the full c/r patchset and
idea of the architecture of the exploiters (ie c/r and maybe
debuggers)
Sorry, the security implications of the in-kernel c/r syscalls were
pretty simple and clear to me, but those of the new approach are not.
-serge
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] Introduce CAP_CHECKPOINT capability and filter map_files/ access
2011-11-17 15:41 ` Serge E. Hallyn
@ 2011-11-17 16:24 ` Cyrill Gorcunov
2011-11-17 20:54 ` Andrew Morton
1 sibling, 0 replies; 6+ messages in thread
From: Cyrill Gorcunov @ 2011-11-17 16:24 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: Andrew Morton, Tejun Heo, Pavel Emelyanov, Vasiliy Kulikov, LKML
On Thu, Nov 17, 2011 at 09:41:05AM -0600, Serge E. Hallyn wrote:
> Quoting Cyrill Gorcunov (gorcunov@gmail.com):
> > The goal idea of checkpoint/restore is to provide this feature not
> > for admins only but regular users as well. Still some operations
> > are privileged -- such as accessing /proc/$pid/map_files.
> >
> > So instead of requiring anyone who has a will to checkpoint/restore
> > processes CAP_SYS_ADMIN privileges, it might (?) be worth to bring a way
> > less powerful CAP_CHECKPOINT capability.
> >
> > The following permissions for CAP_CHECKPOINT should be granted
> > - read/write /proc/$pid/map_files/
>
> read/write to all map files, or only pids he owns?
>
There is a lock_trace() call, which should prevent access to the
map_files of tasks you do not own (unless CAP_SYS_PTRACE is granted).
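As a rough illustration of the ownership rule described here, the following userspace sketch models a ptrace-style access decision. It is a simplification under stated assumptions: `struct model_cred` and `model_may_trace()` are hypothetical names, and the real kernel check behind lock_trace() also considers things this model leaves out (for example the dumpable flag, user namespaces and security modules).

```c
/* Simplified userspace model of a ptrace-style ownership check. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical, reduced view of a task's credentials. */
struct model_cred {
    uid_t uid, euid, suid;
    gid_t gid, egid, sgid;
    bool  cap_sys_ptrace;   /* does the task hold CAP_SYS_PTRACE? */
};

/*
 * The caller may inspect the target if the caller's real uid/gid match
 * all of the target's real, effective and saved ids; otherwise only
 * CAP_SYS_PTRACE lets the caller through.
 */
bool model_may_trace(const struct model_cred *caller,
                     const struct model_cred *target)
{
    assert(caller != NULL && target != NULL);

    if (caller->uid == target->uid &&
        caller->uid == target->euid &&
        caller->uid == target->suid &&
        caller->gid == target->gid &&
        caller->gid == target->egid &&
        caller->gid == target->sgid)
        return true;

    return caller->cap_sys_ptrace;
}
```

In this model a user can always reach their own tasks, and reaching anyone else's requires CAP_SYS_PTRACE, which is the behaviour the reply relies on to keep CAP_CHECKPOINT from opening up every task's map_files.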
> I think a CAP_CHECKPOINT may make sense, but not if includes read/write
> to all map files. That's too much power, and you may as well just hand
> him everything. But, CAP_CHECKPOINT shouldn't need to include that. You
> should be able to get that for instance by being the creator of the user
> namespace being checkpointed. If you really want to checkpoint/restart
> anything on the system, then you should be required to be root. Trying
> to easily hand that power to an unprivileged user is more dangerous imo.
>
> > - (not yet merged) clone-with-specified-pid, might be changed to last_pid+clone setup
> > - (not yet published/stabilized) prctls calls to tune up vDSO and elements
> > of mm_struct such as mm->start_code, mm->end_code, mm->start_data and etc
> >
> > I would like to gather people opinions on such approach as a general.
> > _ANY_ comments are highly appreciated. Would it worth it or not (since
> > CAPs space is pretty limited one).
>
> It's hard to have a specific dialogue without the full c/r patchset and
> idea of the architecture of the exploiters (ie c/r and maybe
> debuggers)
The kernel patches (the ones needed at the moment) are at
http://goo.gl/DwYHx . I haven't pushed them for review yet since
they are not stable enough and need some rework, but you could take
a look if you're interested. Feedback is appreciated as always,
but be warned, they are not yet for inclusion ;)
>
> Sorry, the security implications of the in-kernel c/r syscalls were
> pretty simple and clear to me, but those of the new approach are not.
>
> -serge
Thanks a lot for taking a look, Serge!
Cyrill
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] Introduce CAP_CHECKPOINT capability and filter map_files/ access
2011-11-17 15:41 ` Serge E. Hallyn
2011-11-17 16:24 ` Cyrill Gorcunov
@ 2011-11-17 20:54 ` Andrew Morton
2011-11-17 21:07 ` Serge E. Hallyn
2011-11-17 21:31 ` Cyrill Gorcunov
1 sibling, 2 replies; 6+ messages in thread
From: Andrew Morton @ 2011-11-17 20:54 UTC (permalink / raw)
To: Serge E. Hallyn
Cc: Cyrill Gorcunov, Tejun Heo, Pavel Emelyanov, Vasiliy Kulikov,
LKML
On Thu, 17 Nov 2011 09:41:05 -0600
"Serge E. Hallyn" <serge.hallyn@canonical.com> wrote:
> > - (not yet merged) clone-with-specified-pid, might be changed to last_pid+clone setup
> > - (not yet published/stabilized) prctls calls to tune up vDSO and elements
> > of mm_struct such as mm->start_code, mm->end_code, mm->start_data and etc
> >
> > I would like to gather people opinions on such approach as a general.
> > _ANY_ comments are highly appreciated. Would it worth it or not (since
> > CAPs space is pretty limited one).
>
> It's hard to have a specific dialogue without the full c/r patchset and
> idea of the architecture of the exploiters (ie c/r and maybe
> debuggers)
>
> Sorry, the security implications of the in-kernel c/r syscalls were
> pretty simple and clear to me, but those of the new approach are not.
yup.
From a development-order perspective perhaps it is better to get
everything working and stabilized for root first. Then as a separate
activity start working on making it available to less-privileged users.
We would need to be confident that such a second development effort
doesn't cause back-compatibility issues (ie: interface changes) for
existing root users.
Is it possible that once everything is working for root, we realise
that we can get it all working for non-root users via suitable setuid
userspace tools?
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] Introduce CAP_CHECKPOINT capability and filter map_files/ access
2011-11-17 20:54 ` Andrew Morton
@ 2011-11-17 21:07 ` Serge E. Hallyn
2011-11-17 21:31 ` Cyrill Gorcunov
1 sibling, 0 replies; 6+ messages in thread
From: Serge E. Hallyn @ 2011-11-17 21:07 UTC (permalink / raw)
To: Andrew Morton
Cc: Cyrill Gorcunov, Tejun Heo, Pavel Emelyanov, Vasiliy Kulikov,
LKML
Quoting Andrew Morton (akpm@linux-foundation.org):
> On Thu, 17 Nov 2011 09:41:05 -0600
> "Serge E. Hallyn" <serge.hallyn@canonical.com> wrote:
>
> > > - (not yet merged) clone-with-specified-pid, might be changed to last_pid+clone setup
> > > - (not yet published/stabilized) prctls calls to tune up vDSO and elements
> > > of mm_struct such as mm->start_code, mm->end_code, mm->start_data and etc
> > >
> > > I would like to gather people opinions on such approach as a general.
> > > _ANY_ comments are highly appreciated. Would it worth it or not (since
> > > CAPs space is pretty limited one).
> >
> > It's hard to have a specific dialogue without the full c/r patchset and
> > idea of the architecture of the exploiters (ie c/r and maybe
> > debuggers)
> >
> > Sorry, the security implications of the in-kernel c/r syscalls were
> > pretty simple and clear to me, but those of the new approach are not.
>
> yup.
>
> From a development-order perspective perhaps it is better to get
> everything working and stabilized for root first. Then as a separate
> activity start working on making it available to less-privileged users.
>
> We would need to be confident that such a second development effort
> doesn't cause back-compatibility issues (ie: interface changes) for
> existing root users.
>
>
>
> Is it possible that once everything is working for root, we realise
> that we can get it all working for non-root users via suitable setuid
> userspace tools?
Not only that, I think it's possible that by the time all the needed c/r
pieces are in, user namespaces will be as well, as will unprivileged
namespace cloning (at least when done along with CLONE_NEWUSER). In that
case, it should be possible to do c/r of a container with no privileges
on the host (but full privileges to the user namespace of the container).
-serge
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RFC] Introduce CAP_CHECKPOINT capability and filter map_files/ access
2011-11-17 20:54 ` Andrew Morton
2011-11-17 21:07 ` Serge E. Hallyn
@ 2011-11-17 21:31 ` Cyrill Gorcunov
1 sibling, 0 replies; 6+ messages in thread
From: Cyrill Gorcunov @ 2011-11-17 21:31 UTC (permalink / raw)
To: Andrew Morton
Cc: Serge E. Hallyn, Tejun Heo, Pavel Emelyanov, Vasiliy Kulikov,
LKML
On Thu, Nov 17, 2011 at 12:54:14PM -0800, Andrew Morton wrote:
...
> >
> > It's hard to have a specific dialogue without the full c/r patchset and
> > idea of the architecture of the exploiters (ie c/r and maybe
> > debuggers)
> >
> > Sorry, the security implications of the in-kernel c/r syscalls were
> > pretty simple and clear to me, but those of the new approach are not.
>
> yup.
>
> From a development-order perspective perhaps it is better to get
> everything working and stabilized for root first. Then as a separate
> activity start working on making it available to less-privileged users.
>
> We would need to be confident that such a second development effort
> doesn't cause back-compatibility issues (ie: interface changes) for
> existing root users.
>
> Is it possible that once everything is working for root, we realise
> that we can get it all working for non-root users via suitable setuid
> userspace tools?
Once it works well under root (I'm actually testing it under kvm with a
root account), I believe tuning the code for non-root users should be
possible too. At the moment I need CAP_SYS_ADMIN only because of map_files/;
technically all I need is ptrace access to the task(s) being dumped and
access to map_files.
Cyrill
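For readers unfamiliar with map_files/, a small sketch of how a dumping tool might address one entry is shown below. It assumes the naming used by mainline kernels that provide map_files, where each entry is named after the mapping's address range as start-end in hexadecimal. The helper name `map_files_path()` is hypothetical and is not taken from the c/r tools discussed in this thread.

```c
/* Build the path of one /proc/<pid>/map_files/ entry; illustrative only. */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Writes "/proc/<pid>/map_files/<start>-<end>" into buf, with the
 * addresses printed in hex without a 0x prefix.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int map_files_path(char *buf, size_t len, pid_t pid,
                   unsigned long start, unsigned long end)
{
    int n;

    assert(buf != NULL && start < end);

    n = snprintf(buf, len, "/proc/%ld/map_files/%lx-%lx",
                 (long)pid, start, end);
    if (n < 0 || (size_t)n >= len)
        return -1;
    return 0;
}
```

A tool would typically take the start and end addresses from /proc/<pid>/maps and then open the resulting path. That open is the step the thread wants to gate on something weaker than CAP_SYS_ADMIN.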
^ permalink raw reply [flat|nested] 6+ messages in thread