From: "Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>
To: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
Cc: Serge Hallyn
<serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>,
Linux Containers
<containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>,
Andrei Vagin <avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>,
LKML <linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
"Eric W. Biederman"
<ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Subject: Re: [PATCH 1/2] kernel: show current values of user namespace counters
Date: Tue, 16 Aug 2016 15:05:29 -0500 [thread overview]
Message-ID: <20160816200529.GA1280@mail.hallyn.com> (raw)
In-Reply-To: <CAGXu5jK0vt9WAaX4a5ihV+iVGFSnn0JJOKAVXocg2VqMV+rxBg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
Quoting Kees Cook (keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org):
> On Mon, Aug 15, 2016 at 1:10 PM, Andrei Vagin <avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org> wrote:
> > Recently Eric added user namespace counters. User namespace counters is
> > a feature that allows to limit the number of various kernel objects a
> > user can create. These limits are set via /proc/sys/user/ sysctls on a
> > per user namespace basis and are applicable to all users in that
> > namespace.
> >
> > This patch adds /proc/PID/userns_counts files to provide current usage
> > of user namespace counters.
> >
> > > cat /proc/813/userns_counts
> > user_namespaces 101000 1
> > pid_namespaces 101000 1
> > ipc_namespaces 101000 4
> > net_namespaces 101000 2
> > mnt_namespaces 101000 5
> > mnt_namespaces 100000 1
> >
> > The meanings of the columns are as follows, from left to right:
> >
> > Name Object name
> > UID User ID
> > Usage Current usage
> >
> > Cc: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
> > Cc: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> > Cc: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
> > Signed-off-by: Andrei Vagin <avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
> > ---
> > fs/proc/array.c | 57 +++++++++++++++++++++++
> > fs/proc/base.c | 3 ++
> > fs/proc/internal.h | 1 +
> > include/linux/user_namespace.h | 8 ++++
> > kernel/ucount.c | 102 +++++++++++++++++++++++++++++++++++++++++
> > 5 files changed, 171 insertions(+)
> >
> > diff --git a/fs/proc/array.c b/fs/proc/array.c
> > index 88c7de1..f186625 100644
> > --- a/fs/proc/array.c
> > +++ b/fs/proc/array.c
> > @@ -734,3 +734,60 @@ const struct file_operations proc_tid_children_operations = {
> > .release = children_seq_release,
> > };
> > #endif /* CONFIG_PROC_CHILDREN */
> > +
> > +#ifdef CONFIG_USER_NS
> > +static int ucounts_open(struct inode *inode, struct file *filp)
> > +{
> > + struct ucounts_iterator *iter;
> > + struct seq_file *seq;
> > + int ret;
> > +
> > + struct task_struct *task;
> > + struct user_namespace *ns;
> > +
> > + task = get_proc_task(inode);
> > + if (!task)
> > + return -ESRCH;
> > +
> > + rcu_read_lock();
> > + ns = get_user_ns(__task_cred(task)->user_ns);
> > + rcu_read_unlock();
> > +
> > + put_task_struct(task);
> > +
> > + if (ns == NULL)
> > + return -ESRCH;
> > +
> > + ret = seq_open_private(filp, &ucounts_seq_operations,
> > + sizeof(struct ucounts_iterator));
> > +
> > + if (ret) {
> > + put_user_ns(ns);
> > + return ret;
> > + }
> > +
> > + seq = filp->private_data;
> > + iter = seq->private;
> > + iter->ns = ns;
> > +
> > + return 0;
> > +}
> > +
> > +int ucounts_release(struct inode *inode, struct file *file)
> > +{
> > + struct seq_file *seq = file->private_data;
> > + struct ucounts_iterator *iter = seq->private;
> > +
> > + put_user_ns(iter->ns);
> > +
> > + return seq_release_private(inode, file);
> > +}
> > +
> > +
> > +const struct file_operations proc_ucounts_operations = {
> > + .open = ucounts_open,
> > + .read = seq_read,
> > + .llseek = seq_lseek,
> > + .release = ucounts_release,
> > +};
> > +#endif /* CONFIG_USER_NS */
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index 54e2702..4252f7a 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -2910,6 +2910,9 @@ static const struct pid_entry tgid_base_stuff[] = {
> > REG("timers", S_IRUGO, proc_timers_operations),
> > #endif
> > REG("timerslack_ns", S_IRUGO|S_IWUGO, proc_pid_set_timerslack_ns_operations),
> > +#ifdef CONFIG_USER_NS
> > + REG("userns_counts", S_IRUGO, proc_ucounts_operations),
> > +#endif
> > };
> >
> > static int proc_tgid_base_readdir(struct file *file, struct dir_context *ctx)
> > diff --git a/fs/proc/internal.h b/fs/proc/internal.h
> > index 7931c55..845cadb 100644
> > --- a/fs/proc/internal.h
> > +++ b/fs/proc/internal.h
> > @@ -298,6 +298,7 @@ extern const struct file_operations proc_pid_smaps_operations;
> > extern const struct file_operations proc_tid_smaps_operations;
> > extern const struct file_operations proc_clear_refs_operations;
> > extern const struct file_operations proc_pagemap_operations;
> > +extern const struct file_operations proc_ucounts_operations;
> >
> > extern unsigned long task_vsize(struct mm_struct *);
> > extern unsigned long task_statm(struct mm_struct *,
> > diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
> > index 30ffe10..5f824dd 100644
> > --- a/include/linux/user_namespace.h
> > +++ b/include/linux/user_namespace.h
> > @@ -106,6 +106,14 @@ extern ssize_t proc_setgroups_write(struct file *, const char __user *, size_t,
> > extern int proc_setgroups_show(struct seq_file *m, void *v);
> > extern bool userns_may_setgroups(const struct user_namespace *ns);
> > extern bool current_in_userns(const struct user_namespace *target_ns);
> > +
> > +struct ucounts_iterator {
> > + struct user_namespace *ns;
> > + int hash;
> > +};
> > +
> > +extern const struct seq_operations ucounts_seq_operations;
> > +
> > #else
> >
> > static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
> > diff --git a/kernel/ucount.c b/kernel/ucount.c
> > index 6ebbe9b..cef09e3 100644
> > --- a/kernel/ucount.c
> > +++ b/kernel/ucount.c
> > @@ -9,6 +9,8 @@
> > #include <linux/sysctl.h>
> > #include <linux/slab.h>
> > #include <linux/hash.h>
> > +#include <linux/seq_file.h>
> > +#include <linux/proc_fs.h>
> > #include <linux/user_namespace.h>
> >
> > #define UCOUNTS_HASHTABLE_BITS 10
> > @@ -232,4 +234,104 @@ static __init int user_namespace_sysctl_init(void)
> > }
> > subsys_initcall(user_namespace_sysctl_init);
> >
> > +#ifdef CONFIG_PROC_FS
> > +static void *ucounts_start(struct seq_file *f, loff_t *pos)
> > +{
> > + struct ucounts_iterator *iter = f->private;
> > + int h, i = 0;
> > +
> > + spin_lock(&ucounts_lock);
>
> This series is much improved, thanks! However, I still don't think
> it's a good idea to hold this spinlock across the start/stop lifetime.
> It creates too many opportunities for abuse. :( Perhaps Eric will have
> some better ideas about how to deal with this...
Would it be too limiting to pre-allocate a page and simply fill it out
right at ucounts_start, so that we can drop the spinlock and simply
return data from that page?
WARNING: multiple messages have this Message-ID (diff)
From: "Serge E. Hallyn" <serge@hallyn.com>
To: Kees Cook <keescook@chromium.org>
Cc: Andrei Vagin <avagin@openvz.org>,
Serge Hallyn <serge.hallyn@canonical.com>,
Linux Containers <containers@lists.linux-foundation.org>,
"Eric W. Biederman" <ebiederm@xmission.com>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/2] kernel: show current values of user namespace counters
Date: Tue, 16 Aug 2016 15:05:29 -0500 [thread overview]
Message-ID: <20160816200529.GA1280@mail.hallyn.com> (raw)
In-Reply-To: <CAGXu5jK0vt9WAaX4a5ihV+iVGFSnn0JJOKAVXocg2VqMV+rxBg@mail.gmail.com>
Quoting Kees Cook (keescook@chromium.org):
> On Mon, Aug 15, 2016 at 1:10 PM, Andrei Vagin <avagin@openvz.org> wrote:
> > Recently Eric added user namespace counters. User namespace counters is
> > a feature that allows to limit the number of various kernel objects a
> > user can create. These limits are set via /proc/sys/user/ sysctls on a
> > per user namespace basis and are applicable to all users in that
> > namespace.
> >
> > This patch adds /proc/PID/userns_counts files to provide current usage
> > of user namespace counters.
> >
> > > cat /proc/813/userns_counts
> > user_namespaces 101000 1
> > pid_namespaces 101000 1
> > ipc_namespaces 101000 4
> > net_namespaces 101000 2
> > mnt_namespaces 101000 5
> > mnt_namespaces 100000 1
> >
> > The meanings of the columns are as follows, from left to right:
> >
> > Name Object name
> > UID User ID
> > Usage Current usage
> >
> > Cc: Serge Hallyn <serge.hallyn@canonical.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> > Signed-off-by: Andrei Vagin <avagin@openvz.org>
> > ---
> > fs/proc/array.c | 57 +++++++++++++++++++++++
> > fs/proc/base.c | 3 ++
> > fs/proc/internal.h | 1 +
> > include/linux/user_namespace.h | 8 ++++
> > kernel/ucount.c | 102 +++++++++++++++++++++++++++++++++++++++++
> > 5 files changed, 171 insertions(+)
> >
> > diff --git a/fs/proc/array.c b/fs/proc/array.c
> > index 88c7de1..f186625 100644
> > --- a/fs/proc/array.c
> > +++ b/fs/proc/array.c
> > @@ -734,3 +734,60 @@ const struct file_operations proc_tid_children_operations = {
> > .release = children_seq_release,
> > };
> > #endif /* CONFIG_PROC_CHILDREN */
> > +
> > +#ifdef CONFIG_USER_NS
> > +static int ucounts_open(struct inode *inode, struct file *filp)
> > +{
> > + struct ucounts_iterator *iter;
> > + struct seq_file *seq;
> > + int ret;
> > +
> > + struct task_struct *task;
> > + struct user_namespace *ns;
> > +
> > + task = get_proc_task(inode);
> > + if (!task)
> > + return -ESRCH;
> > +
> > + rcu_read_lock();
> > + ns = get_user_ns(__task_cred(task)->user_ns);
> > + rcu_read_unlock();
> > +
> > + put_task_struct(task);
> > +
> > + if (ns == NULL)
> > + return -ESRCH;
> > +
> > + ret = seq_open_private(filp, &ucounts_seq_operations,
> > + sizeof(struct ucounts_iterator));
> > +
> > + if (ret) {
> > + put_user_ns(ns);
> > + return ret;
> > + }
> > +
> > + seq = filp->private_data;
> > + iter = seq->private;
> > + iter->ns = ns;
> > +
> > + return 0;
> > +}
> > +
> > +int ucounts_release(struct inode *inode, struct file *file)
> > +{
> > + struct seq_file *seq = file->private_data;
> > + struct ucounts_iterator *iter = seq->private;
> > +
> > + put_user_ns(iter->ns);
> > +
> > + return seq_release_private(inode, file);
> > +}
> > +
> > +
> > +const struct file_operations proc_ucounts_operations = {
> > + .open = ucounts_open,
> > + .read = seq_read,
> > + .llseek = seq_lseek,
> > + .release = ucounts_release,
> > +};
> > +#endif /* CONFIG_USER_NS */
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index 54e2702..4252f7a 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -2910,6 +2910,9 @@ static const struct pid_entry tgid_base_stuff[] = {
> > REG("timers", S_IRUGO, proc_timers_operations),
> > #endif
> > REG("timerslack_ns", S_IRUGO|S_IWUGO, proc_pid_set_timerslack_ns_operations),
> > +#ifdef CONFIG_USER_NS
> > + REG("userns_counts", S_IRUGO, proc_ucounts_operations),
> > +#endif
> > };
> >
> > static int proc_tgid_base_readdir(struct file *file, struct dir_context *ctx)
> > diff --git a/fs/proc/internal.h b/fs/proc/internal.h
> > index 7931c55..845cadb 100644
> > --- a/fs/proc/internal.h
> > +++ b/fs/proc/internal.h
> > @@ -298,6 +298,7 @@ extern const struct file_operations proc_pid_smaps_operations;
> > extern const struct file_operations proc_tid_smaps_operations;
> > extern const struct file_operations proc_clear_refs_operations;
> > extern const struct file_operations proc_pagemap_operations;
> > +extern const struct file_operations proc_ucounts_operations;
> >
> > extern unsigned long task_vsize(struct mm_struct *);
> > extern unsigned long task_statm(struct mm_struct *,
> > diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
> > index 30ffe10..5f824dd 100644
> > --- a/include/linux/user_namespace.h
> > +++ b/include/linux/user_namespace.h
> > @@ -106,6 +106,14 @@ extern ssize_t proc_setgroups_write(struct file *, const char __user *, size_t,
> > extern int proc_setgroups_show(struct seq_file *m, void *v);
> > extern bool userns_may_setgroups(const struct user_namespace *ns);
> > extern bool current_in_userns(const struct user_namespace *target_ns);
> > +
> > +struct ucounts_iterator {
> > + struct user_namespace *ns;
> > + int hash;
> > +};
> > +
> > +extern const struct seq_operations ucounts_seq_operations;
> > +
> > #else
> >
> > static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
> > diff --git a/kernel/ucount.c b/kernel/ucount.c
> > index 6ebbe9b..cef09e3 100644
> > --- a/kernel/ucount.c
> > +++ b/kernel/ucount.c
> > @@ -9,6 +9,8 @@
> > #include <linux/sysctl.h>
> > #include <linux/slab.h>
> > #include <linux/hash.h>
> > +#include <linux/seq_file.h>
> > +#include <linux/proc_fs.h>
> > #include <linux/user_namespace.h>
> >
> > #define UCOUNTS_HASHTABLE_BITS 10
> > @@ -232,4 +234,104 @@ static __init int user_namespace_sysctl_init(void)
> > }
> > subsys_initcall(user_namespace_sysctl_init);
> >
> > +#ifdef CONFIG_PROC_FS
> > +static void *ucounts_start(struct seq_file *f, loff_t *pos)
> > +{
> > + struct ucounts_iterator *iter = f->private;
> > + int h, i = 0;
> > +
> > + spin_lock(&ucounts_lock);
>
> This series is much improved, thanks! However, I still don't think
> it's a good idea to hold this spinlock across the start/stop lifetime.
> It creates too many opportunities for abuse. :( Perhaps Eric will have
> some better ideas about how to deal with this...
Would it be too limiting to pre-allocate a page and simply fill it out
right at ucounts_start, so that we can drop the spinlock and simply
return data from that page?
next prev parent reply other threads:[~2016-08-16 20:05 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-08-15 20:10 [PATCH 0/2 v2] userns: show current values of user namespace counters Andrei Vagin
2016-08-15 20:10 ` Andrei Vagin
[not found] ` <1471291822-539-1-git-send-email-avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2016-08-15 20:10 ` [PATCH 1/2] kernel: " Andrei Vagin
2016-08-15 20:10 ` Andrei Vagin
[not found] ` <1471291822-539-2-git-send-email-avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2016-08-16 20:00 ` Kees Cook
2016-08-16 20:00 ` Kees Cook
[not found] ` <CAGXu5jK0vt9WAaX4a5ihV+iVGFSnn0JJOKAVXocg2VqMV+rxBg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-08-16 20:05 ` Serge E. Hallyn [this message]
2016-08-16 20:05 ` Serge E. Hallyn
[not found] ` <20160816200529.GA1280-7LNsyQBKDXoIagZqoN9o3w@public.gmane.org>
2016-08-16 22:44 ` Andrei Vagin
2016-08-16 22:44 ` Andrei Vagin
2016-08-15 20:10 ` [PATCH 2/2] Documentation: describe /proc/<pid>/userns_counts Andrei Vagin
2016-08-15 20:10 ` Andrei Vagin
2016-08-16 22:53 ` [PATCH 0/2 v2] userns: show current values of user namespace counters Serge E. Hallyn
2016-08-16 22:53 ` Serge E. Hallyn
2016-10-06 17:51 ` Andrei Vagin
2016-10-06 17:51 ` Andrei Vagin
[not found] ` <20161006175146.GA25935-1ViLX0X+lBJGNQ1M2rI3KwRV3xvJKrda@public.gmane.org>
2016-10-06 19:33 ` Eric W. Biederman
2016-10-06 19:33 ` Eric W. Biederman
[not found] ` <87wphlclwe.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
2016-10-10 16:22 ` Andrei Vagin
2016-10-10 16:22 ` Andrei Vagin
[not found] ` <20161010162202.GA31628-1ViLX0X+lBJGNQ1M2rI3KwRV3xvJKrda@public.gmane.org>
2016-10-10 20:44 ` Eric W. Biederman
2016-10-10 20:44 ` Eric W. Biederman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160816200529.GA1280@mail.hallyn.com \
--to=serge-a9i7lubdfnhqt0dzr+alfa@public.gmane.org \
--cc=avagin-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org \
--cc=containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
--cc=ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org \
--cc=keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org \
--cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.