* [PATCH RFC] audit: provide namespace information in user originated records
From: Aristeu Rozanski @ 2013-03-18 15:45 UTC
To: linux-audit
(re-sending this, linux-audit is members only it seems)
This patchset introduces a new audit record to follow all USER records which
provides namespace information of the process. The idea is to allow processes
in containers to create records in the host system while providing means to be
filtered out.
For each new namespace, a unique procfs inode number is allocated and this
number has been used by userspace to determine which processes belong to the
same namespace. These numbers are used in the new audit record.
Applications such as libvirt-sandbox and lxc can then report the same numbers
when a container is created and destroyed allowing to map records to a certain
container. Maybe the next step would be having a record for whenever a new
namespace is created?
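A minimal userspace sketch of how a container manager could collect the same
numbers for a process, assuming a Linux system where /proc/<pid>/ns/<name>
exists for each namespace type. The function name format_ns_fields() and the
field order are illustrative assumptions that mirror the example record
below; they are not part of the patchset.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Namespace names, in the order used by the example record (assumption). */
static const char *const ns_names[] = {
	"mnt", "net", "uts", "ipc", "pid", "user"
};

/*
 * Build a "mnt=N net=N ..." string for the process named by pid
 * (a decimal string, or "self") from the inode numbers of its
 * /proc/<pid>/ns/<name> entries.  Namespace types that cannot be
 * stat()ed are skipped.  Returns 0 on success, -1 if nothing could
 * be read or the buffer is too small.
 */
int format_ns_fields(const char *pid, char *buf, size_t size)
{
	size_t used = 0;
	size_t i;

	if (size == 0)
		return -1;
	buf[0] = '\0';

	for (i = 0; i < sizeof(ns_names) / sizeof(ns_names[0]); i++) {
		char path[64];
		struct stat st;
		int n;

		snprintf(path, sizeof(path), "/proc/%s/ns/%s", pid, ns_names[i]);
		if (stat(path, &st) != 0)
			continue;

		n = snprintf(buf + used, size - used, "%s%s=%lu",
			     used ? " " : "", ns_names[i],
			     (unsigned long)st.st_ino);
		if (n < 0 || (size_t)n >= size - used)
			return -1;
		used += (size_t)n;
	}

	return used ? 0 : -1;
}
```

Calling format_ns_fields("self", buf, sizeof(buf)) fills buf with the numbers
for the calling process; a manager would pass the pid of the container's
init process instead.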
First 6 patches are needed in order to get each namespace's inode number.
Patch 7 properly defines the new record that is related to the USER record
Patch 8 allows USER records to be generated from namespaces
Here's an example of output:
type=CRED_DISP msg=audit(1363528861.403:311): pid=20016 uid=0 auid=0 ses=45 subj=system_u:system_r:crond_t:s0-s0:c0.c1023 msg='op=PAM:setcred acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
type=UNKNOWN[1327] msg=audit(1363528861.403:311): mnt=4026531840 net=4026531956 uts=4026531838 ipc=4026531839 pid=4026531836 user=4026531837
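To illustrate how userspace might filter on such a record, here is a small
sketch that extracts one key=value field from a record line and compares it
with a known namespace number. The helper names audit_field_ulong() and
record_in_namespace() are hypothetical; only the field names come from the
example output above.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Find "key=<decimal>" in an audit record line.  The key must start the
 * line or follow a space, so "nt" does not match inside "mnt".
 * Returns 0 and stores the value on success, -1 otherwise.
 */
int audit_field_ulong(const char *record, const char *key, unsigned long *out)
{
	size_t klen = strlen(key);
	const char *p = record;

	if (klen == 0)
		return -1;

	while ((p = strstr(p, key)) != NULL) {
		int at_boundary = (p == record) || (p[-1] == ' ');

		if (at_boundary && p[klen] == '=') {
			const char *val = p + klen + 1;
			char *end;
			unsigned long v;

			errno = 0;
			v = strtoul(val, &end, 10);
			if (end == val || errno != 0)
				return -1;
			*out = v;
			return 0;
		}
		p += klen;
	}
	return -1;
}

/* Return 1 if the record carries key=<inum>, 0 otherwise. */
int record_in_namespace(const char *record, const char *key, unsigned long inum)
{
	unsigned long v;

	return audit_field_ulong(record, key, &v) == 0 && v == inum;
}
```

A log consumer could drop every record for which
record_in_namespace(line, "net", host_net_inum) returns 0, where
host_net_inum is whatever number it recorded for the host.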
Notes:
- this is a RFC, all sorts of feedback are much appreciated
- while the last patch allows a new userns to send audit records, I haven't
  looked yet at making sure it has proper capabilities so regular users'
  containers can create records
- the record number allocated is just a draft. If this patchset evolves into
  something that can be merged, please advise which number is the best
  choice
fs/namespace.c | 14 +++++++
include/linux/ipc_namespace.h | 1
include/linux/mnt_namespace.h | 2 +
include/linux/pid_namespace.h | 1
include/linux/user_namespace.h | 1
include/linux/utsname.h | 1
include/net/net_namespace.h | 1
include/uapi/linux/audit.h | 1
ipc/namespace.c | 14 +++++++
kernel/audit.c | 76 +++++++++++++++++++++++++++++++++++++----
kernel/pid_namespace.c | 11 +++++
kernel/user_namespace.c | 5 ++
kernel/utsname.c | 14 +++++++
net/core/net_namespace.c | 14 +++++++
14 files changed, 150 insertions(+), 6 deletions(-)

* [PATCH RFC 1/8] mntns: introduce mntns_get_inum()
From: Aristeu Rozanski @ 2013-03-18 15:45 UTC
To: linux-audit

This allows other parts of the kernel to have access to userspace visible
namespace identification.

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
---
 fs/namespace.c                |   14 ++++++++++++++
 include/linux/mnt_namespace.h |    2 ++
 2 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index 50ca17d..b8a888f 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2792,6 +2792,20 @@ static unsigned int mntns_inum(void *ns)
 	return mnt_ns->proc_inum;
 }
 
+unsigned int mntns_get_inum(struct task_struct *tsk)
+{
+	struct nsproxy *nsproxy;
+	unsigned int rc = 0;
+
+	rcu_read_lock();
+	nsproxy = task_nsproxy(tsk);
+	if (nsproxy)
+		rc = mntns_inum(nsproxy->mnt_ns);
+	rcu_read_unlock();
+
+	return rc;
+}
+
 const struct proc_ns_operations mntns_operations = {
 	.name		= "mnt",
 	.type		= CLONE_NEWNS,
diff --git a/include/linux/mnt_namespace.h b/include/linux/mnt_namespace.h
index 12b2ab5..b6afe65 100644
--- a/include/linux/mnt_namespace.h
+++ b/include/linux/mnt_namespace.h
@@ -5,10 +5,12 @@
 struct mnt_namespace;
 struct fs_struct;
 struct user_namespace;
+struct task_struct;
 
 extern struct mnt_namespace *copy_mnt_ns(unsigned long, struct mnt_namespace *,
 		struct user_namespace *, struct fs_struct *);
 extern void put_mnt_ns(struct mnt_namespace *ns);
+extern unsigned int mntns_get_inum(struct task_struct *tsk);
 
 extern const struct file_operations proc_mounts_operations;
 extern const struct file_operations proc_mountinfo_operations;
-- 
1.7.1

* [PATCH RFC 2/8] ipcns: introduce ipcns_get_inum()
From: Aristeu Rozanski @ 2013-03-18 15:45 UTC
To: linux-audit

This allows other parts of the kernel to have access to userspace visible
namespace identification.

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
---
 include/linux/ipc_namespace.h |    1 +
 ipc/namespace.c               |   14 ++++++++++++++
 2 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/include/linux/ipc_namespace.h b/include/linux/ipc_namespace.h
index ae221a7..f9fb114 100644
--- a/include/linux/ipc_namespace.h
+++ b/include/linux/ipc_namespace.h
@@ -146,6 +146,7 @@ static inline struct ipc_namespace *get_ipc_ns(struct ipc_namespace *ns)
 }
 
 extern void put_ipc_ns(struct ipc_namespace *ns);
+extern unsigned int ipcns_get_inum(struct task_struct *tsk);
 #else
 static inline struct ipc_namespace *copy_ipcs(unsigned long flags,
 	struct user_namespace *user_ns, struct ipc_namespace *ns)
diff --git a/ipc/namespace.c b/ipc/namespace.c
index 7c1fa45..4615db5 100644
--- a/ipc/namespace.c
+++ b/ipc/namespace.c
@@ -188,6 +188,20 @@ static unsigned int ipcns_inum(void *vp)
 	return ns->proc_inum;
 }
 
+unsigned int ipcns_get_inum(struct task_struct *tsk)
+{
+	struct nsproxy *nsproxy;
+	unsigned int rc = 0;
+
+	rcu_read_lock();
+	nsproxy = task_nsproxy(tsk);
+	if (nsproxy)
+		rc = ipcns_inum(nsproxy->ipc_ns);
+	rcu_read_unlock();
+
+	return rc;
+}
+
 const struct proc_ns_operations ipcns_operations = {
 	.name		= "ipc",
 	.type		= CLONE_NEWIPC,
-- 
1.7.1

* [PATCH RFC 3/8] pidns: introduce pidns_get_inum()
From: Aristeu Rozanski @ 2013-03-18 15:45 UTC
To: linux-audit

This allows other parts of the kernel to have access to userspace visible
namespace identification.

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
---
 include/linux/pid_namespace.h |    1 +
 kernel/pid_namespace.c        |   11 +++++++++++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
index 215e5e3..8223654 100644
--- a/include/linux/pid_namespace.h
+++ b/include/linux/pid_namespace.h
@@ -57,6 +57,7 @@ extern struct pid_namespace *copy_pid_ns(unsigned long flags,
 extern void zap_pid_ns_processes(struct pid_namespace *pid_ns);
 extern int reboot_pid_ns(struct pid_namespace *pid_ns, int cmd);
 extern void put_pid_ns(struct pid_namespace *ns);
+extern unsigned int pidns_get_inum(struct task_struct *tsk);
 
 #else /* !CONFIG_PID_NS */
 #include <linux/err.h>
diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
index c1c3dc1..5e463ff 100644
--- a/kernel/pid_namespace.c
+++ b/kernel/pid_namespace.c
@@ -361,6 +361,17 @@ static unsigned int pidns_inum(void *ns)
 	return pid_ns->proc_inum;
 }
 
+unsigned int pidns_get_inum(struct task_struct *tsk)
+{
+	unsigned int rc;
+
+	rcu_read_lock();
+	rc = pidns_inum(task_active_pid_ns(tsk));
+	rcu_read_unlock();
+
+	return rc;
+}
+
 const struct proc_ns_operations pidns_operations = {
 	.name		= "pid",
 	.type		= CLONE_NEWPID,
-- 
1.7.1

* [PATCH RFC 4/8] userns: introduce userns_get_inum()
From: Aristeu Rozanski @ 2013-03-18 15:45 UTC
To: linux-audit

This allows other parts of the kernel to have access to userspace visible
namespace identification.

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
---
 include/linux/user_namespace.h |    1 +
 kernel/user_namespace.c        |    5 +++++
 2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index 4ce0093..520d8b2 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -56,6 +56,7 @@ extern struct seq_operations proc_projid_seq_operations;
 extern ssize_t proc_uid_map_write(struct file *, const char __user *, size_t, loff_t *);
 extern ssize_t proc_gid_map_write(struct file *, const char __user *, size_t, loff_t *);
 extern ssize_t proc_projid_map_write(struct file *, const char __user *, size_t, loff_t *);
+extern unsigned int userns_get_inum(struct task_struct *tsk);
 #else
 
 static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 8b65083..9a0db6d 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -856,6 +856,11 @@ static unsigned int userns_inum(void *ns)
 	return user_ns->proc_inum;
 }
 
+unsigned int userns_get_inum(struct task_struct *tsk)
+{
+	return userns_inum(task_cred_xxx(tsk, user_ns));
+}
+
 const struct proc_ns_operations userns_operations = {
 	.name		= "user",
 	.type		= CLONE_NEWUSER,
-- 
1.7.1

* [PATCH RFC 5/8] utsns: introduce utsns_get_inum()
From: Aristeu Rozanski @ 2013-03-18 15:45 UTC
To: linux-audit

This allows other parts of the kernel to have access to userspace visible
namespace identification.

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
---
 include/linux/utsname.h |    1 +
 kernel/utsname.c        |   14 ++++++++++++++
 2 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/include/linux/utsname.h b/include/linux/utsname.h
index 239e277..eed8ca8 100644
--- a/include/linux/utsname.h
+++ b/include/linux/utsname.h
@@ -36,6 +36,7 @@ static inline void get_uts_ns(struct uts_namespace *ns)
 extern struct uts_namespace *copy_utsname(unsigned long flags,
 	struct user_namespace *user_ns, struct uts_namespace *old_ns);
 extern void free_uts_ns(struct kref *kref);
+extern unsigned int utsns_get_inum(struct task_struct *tsk);
 
 static inline void put_uts_ns(struct uts_namespace *ns)
 {
diff --git a/kernel/utsname.c b/kernel/utsname.c
index a47fc5d..146e95c 100644
--- a/kernel/utsname.c
+++ b/kernel/utsname.c
@@ -130,6 +130,20 @@ static unsigned int utsns_inum(void *vp)
 	return ns->proc_inum;
 }
 
+unsigned int utsns_get_inum(struct task_struct *tsk)
+{
+	struct nsproxy *nsproxy;
+	unsigned int rc = 0;
+
+	rcu_read_lock();
+	nsproxy = task_nsproxy(tsk);
+	if (nsproxy)
+		rc = utsns_inum(nsproxy->uts_ns);
+	rcu_read_unlock();
+
+	return rc;
+}
+
 const struct proc_ns_operations utsns_operations = {
 	.name		= "uts",
 	.type		= CLONE_NEWUTS,
-- 
1.7.1

* [PATCH RFC 6/8] netns: introduce netns_get_inum()
From: Aristeu Rozanski @ 2013-03-18 15:45 UTC
To: linux-audit

This allows other parts of the kernel to have access to userspace visible
namespace identification.

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
---
 include/net/net_namespace.h |    1 +
 net/core/net_namespace.c    |   14 ++++++++++++++
 2 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index de644bc..bb24cf4 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -134,6 +134,7 @@ extern struct net init_net;
 #ifdef CONFIG_NET_NS
 extern struct net *copy_net_ns(unsigned long flags,
 	struct user_namespace *user_ns, struct net *old_net);
+extern unsigned int netns_get_inum(struct task_struct *tsk);
 
 #else /* CONFIG_NET_NS */
 #include <linux/sched.h>
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 80e271d..76c89e5 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -664,6 +664,20 @@ static unsigned int netns_inum(void *ns)
 	return net->proc_inum;
 }
 
+unsigned int netns_get_inum(struct task_struct *tsk)
+{
+	struct nsproxy *nsproxy;
+	unsigned int rc = 0;
+
+	rcu_read_lock();
+	nsproxy = task_nsproxy(tsk);
+	if (nsproxy)
+		rc = netns_inum(nsproxy->net_ns);
+	rcu_read_unlock();
+
+	return rc;
+}
+
 const struct proc_ns_operations netns_operations = {
 	.name		= "net",
 	.type		= CLONE_NEWNET,
-- 
1.7.1

* [PATCH RFC 7/8] audit: report namespace information along with USER events
From: Aristeu Rozanski @ 2013-03-18 15:45 UTC
To: linux-audit

For userspace generated events, include a record with the procfs inode
numbers of the namespaces the process belongs to. This allows audit
messages to be tracked down and filtered by userspace.

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
---
 include/uapi/linux/audit.h |    1 +
 kernel/audit.c             |   51 +++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 51 insertions(+), 1 deletions(-)

diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index 9f096f1..3ec3ccb 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -106,6 +106,7 @@
 #define AUDIT_NETFILTER_PKT	1324	/* Packets traversing netfilter chains */
 #define AUDIT_NETFILTER_CFG	1325	/* Netfilter chain modifications */
 #define AUDIT_SECCOMP		1326	/* Secure Computing event */
+#define AUDIT_USER_NAMESPACE	1327	/* Information about process' namespaces */
 
 #define AUDIT_AVC		1400	/* SE Linux avc denial or grant */
 #define AUDIT_SELINUX_ERR	1401	/* Internal SE Linux Errors */
diff --git a/kernel/audit.c b/kernel/audit.c
index 58db117..b17f9c0 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -62,6 +62,11 @@
 #include <linux/freezer.h>
 #include <linux/tty.h>
 #include <linux/pid_namespace.h>
+#include <linux/ipc_namespace.h>
+#include <linux/mnt_namespace.h>
+#include <linux/utsname.h>
+#include <linux/user_namespace.h>
+#include <net/net_namespace.h>
 
 #include "audit.h"
 
@@ -641,6 +646,49 @@ static int audit_log_common_recv_msg(struct audit_buffer **ab, u16 msg_type,
 	return rc;
 }
 
+#ifdef CONFIG_NAMESPACES
+static int audit_log_namespaces(struct task_struct *tsk,
+				struct sk_buff *skb)
+{
+	struct audit_context *ctx = tsk->audit_context;
+	struct audit_buffer *ab;
+
+	if (!audit_enabled)
+		return 0;
+
+	ab = audit_log_start(ctx, GFP_KERNEL, AUDIT_USER_NAMESPACE);
+	if (unlikely(!ab))
+		return -ENOMEM;
+
+	audit_log_format(ab, "mnt=%u", mntns_get_inum(tsk));
+#ifdef CONFIG_NET_NS
+	audit_log_format(ab, " net=%u", netns_get_inum(tsk));
+#endif
+#ifdef CONFIG_UTS_NS
+	audit_log_format(ab, " uts=%u", utsns_get_inum(tsk));
+#endif
+#ifdef CONFIG_IPC_NS
+	audit_log_format(ab, " ipc=%u", ipcns_get_inum(tsk));
+#endif
+#ifdef CONFIG_PID_NS
+	audit_log_format(ab, " pid=%u", pidns_get_inum(tsk));
+#endif
+#ifdef CONFIG_USER_NS
+	audit_log_format(ab, " user=%u", userns_get_inum(tsk));
+#endif
+	audit_set_pid(ab, NETLINK_CB(skb).portid);
+	audit_log_end(ab);
+
+	return 0;
+}
+#else
+static inline int audit_log_namespaces(struct task_struct *tsk,
+				       struct sk_buff *skb)
+{
+	return 0;
+}
+#endif
+
 static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 {
 	u32			seq, sid;
@@ -741,7 +789,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 			}
 			audit_log_common_recv_msg(&ab, msg_type,
 						  loginuid, sessionid, sid,
-						  NULL);
+						  current->audit_context);
 
 			if (msg_type != AUDIT_USER_TTY)
 				audit_log_format(ab, " msg='%.1024s'",
@@ -758,6 +806,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 			}
 			audit_set_pid(ab, NETLINK_CB(skb).portid);
 			audit_log_end(ab);
+			audit_log_namespaces(current, skb);
 		}
 		break;
 	case AUDIT_ADD:
-- 
1.7.1

* [PATCH RFC 8/8] audit: allow user records to be created inside a container
From: Aristeu Rozanski @ 2013-03-18 15:45 UTC
To: linux-audit

Since user events will be followed by namespace information, userspace can
filter off undesired container records.

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
---
 kernel/audit.c |   25 ++++++++++++++++++++-----
 1 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index b17f9c0..cc6ffc9 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -570,6 +570,23 @@ out:
 	kfree(reply);
 }
 
+static int audit_namespace_check(struct task_struct *tsk, u16 msg_type)
+{
+	/* USER messages are allowed from inside containers */
+	switch (msg_type) {
+	case AUDIT_USER:
+	case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
+	case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
+		return 1;
+	default:
+		if ((current_user_ns() != &init_user_ns) ||
+		    (task_active_pid_ns(current) != &init_pid_ns))
+			return 0;
+		break;
+	}
+	return 1;
+}
+
 /*
  * Check for appropriate CAP_AUDIT_ capabilities on incoming audit
  * control messages.
@@ -578,9 +595,7 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
 {
 	int err = 0;
 
-	/* Only support the initial namespaces for now. */
-	if ((current_user_ns() != &init_user_ns) ||
-	    (task_active_pid_ns(current) != &init_pid_ns))
+	if (!audit_namespace_check(current, msg_type))
 		return -EPERM;
 
 	switch (msg_type) {
@@ -597,13 +612,13 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
 	case AUDIT_TTY_SET:
 	case AUDIT_TRIM:
 	case AUDIT_MAKE_EQUIV:
-		if (!capable(CAP_AUDIT_CONTROL))
+		if (!nsown_capable(CAP_AUDIT_CONTROL))
 			err = -EPERM;
 		break;
 	case AUDIT_USER:
 	case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG:
 	case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2:
-		if (!capable(CAP_AUDIT_WRITE))
+		if (!nsown_capable(CAP_AUDIT_WRITE))
 			err = -EPERM;
 		break;
 	default:  /* bad msg */
-- 
1.7.1

* Re: [PATCH RFC] audit: provide namespace information in user originated records
From: Eric W. Biederman @ 2013-03-18 22:16 UTC
To: Aristeu Rozanski
Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric Paris

Adding the containers list so folks with container expertise can see
what is being proposed.

Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:

> This patchset introduces a new audit record to follow all USER records which
> provides namespace information of the process. The idea is to allow processes
> in containers to create records in the host system while providing means to be
> filtered out.

It looks like this mechanism makes it easy for an unprivileged program
to spam and overwhelm the audit log.

> For each new namespace, a unique procfs inode number is allocated and this
> number has been used by userspace to determine which processes belong to the
> same namespace. These numbers are used in the new audit record.
>
> Applications such as libvirt-sandbox and lxc can then report the same numbers
> when a container is created and destroyed allowing to map records to a certain
> container. Maybe the next step would be having a record for whenever a new
> namespace is created?
>
> First 6 patches are needed in order to get each namespace's inode number.

Grumble the existing methods can be used you don't have to introduce a
whole new set of methods.  Grumble.  Besides the bug of assuming that
the inodes now and forever will be the same across all instances of
proc.

> Patch 7 properly defines the new record that is related to the USER
> record

Not augmenting the current user records seems a little odd to me.

You also continue my current policy of not allowing any audit
records in the container itself, so I don't quite know what the point
of all of this is.

> Patch 8 allows USER records to be generated from different namespaces

Which essentially allows any user to create any USER record they want
whenever they want.

> Here's an example of output:
> type=CRED_DISP msg=audit(1363528861.403:311): pid=20016 uid=0 auid=0 ses=45 subj=system_u:system_r:crond_t:s0-s0:c0.c1023 msg='op=PAM:setcred acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'

Ok.  This seems totally bizarre.  You are running a container with a
user namespace with some uid mapped to uid 0?

That defeats about half the point of having user namespaces, as half the
files in the world are owned by uid 0, and can be written by uid 0
outside of your user namespace.

Hmm.  I need to look at this in a little more detail but I believe our
use of task_pid_vnr here in the audit record is a long standing bug.

> type=UNKNOWN[1327] msg=audit(1363528861.403:311): mnt=4026531840 net=4026531956 uts=4026531838 ipc=4026531839 pid=4026531836 user=4026531837
>
> Notes:
> - this is a RFC, all sorts of feedback are much appreciated
> - while the last patch allows a new userns to send audit records, I haven't
>   looked yet at making sure it has proper capabilities so regular users'
>   containers can create records

I don't think it does.

> - the record number allocated is just a draft. If this patchset evolves into
>   something that can be merged, please advise which number is the best
>   choice
> - I'm not subscribed to the list, so please make sure I'm on the Cc list
>
>  fs/namespace.c                 |   14 +++++++
>  include/linux/ipc_namespace.h  |    1
>  include/linux/mnt_namespace.h  |    2 +
>  include/linux/pid_namespace.h  |    1
>  include/linux/user_namespace.h |    1
>  include/linux/utsname.h        |    1
>  include/net/net_namespace.h    |    1
>  include/uapi/linux/audit.h     |    1
>  ipc/namespace.c                |   14 +++++++
>  kernel/audit.c                 |   76 +++++++++++++++++++++++++++++++++++++----
>  kernel/pid_namespace.c         |   11 +++++
>  kernel/user_namespace.c        |    5 ++
>  kernel/utsname.c               |   14 +++++++
>  net/core/net_namespace.c       |   14 +++++++
>  14 files changed, 150 insertions(+), 6 deletions(-)

* Re: [PATCH RFC] audit: provide namespace information in user originated records
From: Aristeu Rozanski @ 2013-03-19 12:24 UTC
To: Eric W. Biederman
Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric Paris

On Mon, Mar 18, 2013 at 03:16:52PM -0700, Eric W. Biederman wrote:
> Adding the containers list so folks with container expertise can see
> what is being proposed.
>
> Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:
>
> > This patchset introduces a new audit record to follow all USER records which
> > provides namespace information of the process. The idea is to allow processes
> > in containers to create records in the host system while providing means to be
> > filtered out.
>
> It looks like this mechanism makes it easy for an unprivileged program
> to spam and overwhelm the audit log.
>
> > For each new namespace, a unique procfs inode number is allocated and this
> > number has been used by userspace to determine which processes belong to the
> > same namespace. These numbers are used in the new audit record.
> >
> > Applications such as libvirt-sandbox and lxc can then report the same numbers
> > when a container is created and destroyed allowing to map records to a certain
> > container. Maybe the next step would be having a record for whenever a new
> > namespace is created?
> >
> > First 6 patches are needed in order to get each namespace's inode number.
>
> Grumble the existing methods can be used you don't have to introduce a
> whole new set of methods.  Grumble.  Besides the bug of assuming that
> the inodes now and forever will be the same across all instances of
> proc.

the existing methods are for procfs use and I didn't want to abuse it.
like I said in the other email, given that it's not a reliable way to
indefinitely describe a namespace due to multiple procfs instances or
migration, the whole idea is flawed.

> > Patch 7 properly defines the new record that is related to the USER
> > record
>
> Not augmenting the current user records seems a little odd to me.
>
> You also continue my current policy of not allowing any audit
> records in the container itself, so I don't quite know what the point
> of all of this is.

your current policy wasn't known to me and
	/* Only support the initial namespaces for now. */
sounds like something that didn't happen for other reasons

> > Patch 8 allows USER records to be generated from different namespaces
>
> Which essentially allows any user to create any USER record they want
> whenever they want.
>
> > Here's an example of output:
> > type=CRED_DISP msg=audit(1363528861.403:311): pid=20016 uid=0 auid=0 ses=45 subj=system_u:system_r:crond_t:s0-s0:c0.c1023 msg='op=PAM:setcred acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
>
> Ok.  This seems totally bizarre.  You are running a container with a
> user namespace with some uid mapped to uid 0?

on the notes section:
- while the last patch allows a new userns to send audit records, I haven't
  looked yet at making sure it has proper capabilities so regular users'
  containers can create records

so I haven't tried it with userns. It's a RFC. That's a regular record
to show the related records, using initial namespaces. like I stated in
the email, I wasn't sure how I'd handle capabilities but the idea would be
to allow containers to log to the system's auditd. since inode numbers
aren't reliable for more than a moment, I guess there's no other
way than having an audit namespace and running an audit daemon inside the
container (and communicating over the network like an individual host).

--
Aristeu

* Re: [PATCH RFC] audit: provide namespace information in user originated records
From: Eric W. Biederman @ 2013-03-20 0:00 UTC
To: Aristeu Rozanski
Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric Paris

Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:

> On Mon, Mar 18, 2013 at 03:16:52PM -0700, Eric W. Biederman wrote:
>> Adding the containers list so folks with container expertise can see
>> what is being proposed.
>>
>> Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:
>>
>> > This patchset introduces a new audit record to follow all USER records which
>> > provides namespace information of the process. The idea is to allow processes
>> > in containers to create records in the host system while providing means to be
>> > filtered out.
>>
>> It looks like this mechanism makes it easy for an unprivileged program
>> to spam and overwhelm the audit log.
>>
>> > For each new namespace, a unique procfs inode number is allocated and this
>> > number has been used by userspace to determine which processes belong to the
>> > same namespace. These numbers are used in the new audit record.
>> >
>> > Applications such as libvirt-sandbox and lxc can then report the same numbers
>> > when a container is created and destroyed allowing to map records to a certain
>> > container. Maybe the next step would be having a record for whenever a new
>> > namespace is created?
>> >
>> > First 6 patches are needed in order to get each namespace's inode number.
>>
>> Grumble the existing methods can be used you don't have to introduce a
>> whole new set of methods.  Grumble.  Besides the bug of assuming that
>> the inodes now and forever will be the same across all instances of
>> proc.
>
> the existing methods are for procfs use and I didn't want to abuse it.
> like I said in the other email, given that it's not a reliable way to
> indefinitely describe a namespace due to multiple procfs instances or
> migration, the whole idea is flawed.

It is always possible to pick the instance of /proc connected to the
initial pid namespace.  And there is a device number you can use to say
that.

Usually designs that need global identifiers for namespaces suffer from
the need for a namespace of namespaces (which we sort of have in /proc),
and I push back by default to get people to think if what they are
trying to do really makes sense.

>> > Patch 7 properly defines the new record that is related to the USER
>> > record
>>
>> Not augmenting the current user records seems a little odd to me.
>>
>> You also continue my current policy of not allowing any audit
>> records in the container itself, so I don't quite know what the point
>> of all of this is.
>
> your current policy wasn't known to me and
> 	/* Only support the initial namespaces for now. */
> sounds like something that didn't happen for other reasons

The reasons were simply that to my knowledge no one has thought through
how audit records and namespaces make sense to interact.

My expectation would be that an extension of audit records would be
logged on a per container basis.  But I don't have any motivating
examples.

>> > Patch 8 allows USER records to be generated from different namespaces
>>
>> Which essentially allows any user to create any USER record they want
>> whenever they want.
>>
>> > Here's an example of output:
>> > type=CRED_DISP msg=audit(1363528861.403:311): pid=20016 uid=0 auid=0 ses=45 subj=system_u:system_r:crond_t:s0-s0:c0.c1023 msg='op=PAM:setcred acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success'
>>
>> Ok.  This seems totally bizarre.  You are running a container with a
>> user namespace with some uid mapped to uid 0?
>
> on the notes section:
> - while the last patch allows a new userns to send audit records, I haven't
>   looked yet at making sure it has proper capabilities so regular users'
>   containers can create records
>
> so I haven't tried it with userns. It's a RFC.

I thought you would have taken the time to run it at least once, or to
perhaps have manually edited your example to see how things would fit
together.

> That's a regular record
> to show the related records, using initial namespaces. like I stated in
> the email, I wasn't sure how I'd handle capabilities but the idea would be
> to allow containers to log to the system's auditd. since inode numbers
> aren't reliable for more than a moment, I guess there's no other
> way than having an audit namespace and running an audit daemon inside the
> container (and communicating over the network like an individual host).

What was really missing from your RFC is a motivating example.  I sort
of see that in your paragraph above but it isn't clear to me.  What is
lost by not allowing USER audit records from processes in containers?
What is gained by implementing user process to have them?

And of course what are your thoughts on preventing unprivileged users
overwhelming the audit subsystem.

My minimal experience with the audit subsystem roughly feels like hardly
anyone really cares.  Although I may be wrong.

Eric

[parent not found: <874ng7gcst.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH RFC] audit: provide namespace information in user originated records
[not found] ` <874ng7gcst.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
@ 2013-03-20 15:12 ` Serge Hallyn
2013-03-20 15:45 ` Aristeu Rozanski
1 sibling, 0 replies; 24+ messages in thread
From: Serge Hallyn @ 2013-03-20 15:12 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric Paris

Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:
> The reasons were simply that to my knowledge no one has thought through
> how audit records and namespaces make sense to interact.

It seems clear to me (perhaps wrongly :) that:

1. auditd is a host service only.
2. in cases where the namespace is hierarchical and resources have
   identifiers in the init namespace (i.e. pid and user ns), audit should
   simply, always, report the id in the init ns
3. in cases where namespaces are not hierarchical (ipc, netns) the
   (ns_id, resource_id) need to be dumped. The ns_id should be the inode #
   for the /proc/$$/ns/$namespace, since that is what is used for setns.

Syslog I want eventually to be namespaced. Audit, not. Audit is (ISTM)
about LSPP and such - things which we can't talk about in containers
anyway.

-serge
^ permalink raw reply [flat|nested] 24+ messages in thread
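[Editorial illustration] Point 3 in the message above takes the inode number behind /proc/$$/ns/$namespace as the namespace id. A minimal user-space sketch of reading those numbers might look like the following. The helper names `ns_inums` and `same_namespace` and the `proc_root` parameter are assumptions made for this example; they do not come from the patchset.

```python
import os

def ns_inums(pid="self", proc_root="/proc"):
    """Map namespace name -> inode number for one process.

    Every entry under <proc_root>/<pid>/ns is stat()ed (following the
    link) and its st_ino is taken as the namespace identifier.
    """
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    inums = {}
    for name in sorted(os.listdir(ns_dir)):
        inums[name] = os.stat(os.path.join(ns_dir, name)).st_ino
    return inums

def same_namespace(pid_a, pid_b, name, proc_root="/proc"):
    """True when both processes show the same inode for one namespace."""
    return ns_inums(pid_a, proc_root)[name] == ns_inums(pid_b, proc_root)[name]
```

As noted earlier in the thread, such a number is only meaningful relative to one instance of /proc, so a caller that needs to say which instance it used could record the `st_dev` of the same stat result alongside the inode.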
* Re: [PATCH RFC] audit: provide namespace information in user originated records
[not found] ` <874ng7gcst.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-03-20 15:12 ` Serge Hallyn
@ 2013-03-20 15:45 ` Aristeu Rozanski
[not found] ` <20130320154503.GF20187-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
1 sibling, 1 reply; 24+ messages in thread
From: Aristeu Rozanski @ 2013-03-20 15:45 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric Paris

On Tue, Mar 19, 2013 at 05:00:50PM -0700, Eric W. Biederman wrote:
> It is always possible to pick the instance of /proc connected to the
> initial pid namespace. And there is a device number you can use to say
> that.

I wasn't aware of that, I'll take a look, thanks!

> The reasons were simply that to my knowledge no one has thought through
> how audit records and namespaces make sense to interact.
>
> My expectation would be that an extension of audit records would be
> logged on a per container basis. But I don't have any motivating
> examples.

from what I've heard, there are two possibilities here: if a container is
understood to be "light virtualization", it should behave just like
another machine by having its own auditd daemon, sending records over
the network to the host. If that's not the case, a single auditd must be
present. But since you might want to run an sshd server inside a
container, it might be desirable to have USER_AUTH records, for example.

> I thought you would have taken the time to run it at least once, or to
> perhaps have manually edited your example to see how things would fit
> together.

I did run it with different namespaces but not with userns. The example
was to show what the extra record would look like and I randomly picked
one. The idea is that auditd will know which namespaces are the original
ones and can use that to filter containers' records, which could be
filtered out by default.

> What was really missing from your RFC is a motivating example. I sort
> of see that in your paragraph above but it isn't clear to me.
>
> What is lost by not allowing USER audit records from processes in
> containers? What is gained by implementing user process to have them?
> And of course what are your thoughts on preventing unprivileged users
> overwhelming the audit subsystem.

This is a bit fuzzy to me, perhaps because I'm not fully understanding
userns implementation yet, so bear with me:
I thought of changing so userns would not grant CAP_AUDIT_WRITE and
CAP_AUDIT_CONTROL unless the process already has it (i.e. it'd require
to be root on the init_ns). The 'init' process would start trusted daemons
with those capabilities then drop the capabilities for everything else.

Does it make sense?

--
Aristeu
^ permalink raw reply [flat|nested] 24+ messages in thread
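[Editorial illustration] The filtering idea in the message above (auditd knows the original namespaces and uses them to separate container records) can be sketched in user space roughly as follows. The field names are taken from the example record in the cover letter; the function names `parse_ns_record` and `from_host` are hypothetical, and this is a sketch under those assumptions rather than anything auditd is known to implement.

```python
NS_KEYS = ("mnt", "net", "uts", "ipc", "pid", "user")

def parse_ns_record(line):
    """Extract the namespace fields from one namespace-information record."""
    # Everything after the "msg=audit(...): " prefix is key=value pairs.
    _, _, fields = line.partition("): ")
    pairs = (item.split("=", 1) for item in fields.split())
    return {key: int(value) for key, value in pairs if key in NS_KEYS}

def from_host(record_ns, host_ns):
    """True when every namespace in the record matches the host's own."""
    return all(record_ns.get(key) == host_ns.get(key) for key in NS_KEYS)
```

Note that in this record `pid=` names the pid namespace, not a process id. A daemon could compute `host_ns` once at startup and drop, or tag, any record for which `from_host()` is false.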
[parent not found: <20130320154503.GF20187-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH RFC] audit: provide namespace information in user originated records
[not found] ` <20130320154503.GF20187-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-03-20 18:36 ` Serge Hallyn
2013-03-20 18:42 ` Eric Paris
0 siblings, 1 reply; 24+ messages in thread
From: Serge Hallyn @ 2013-03-20 18:36 UTC (permalink / raw)
To: Aristeu Rozanski
Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric W. Biederman, Eric Paris

Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> This is a bit fuzzy to me, perhaps because I'm not fully understanding
> userns implementation yet, so bear with me:
> I thought of changing so userns would not grant CAP_AUDIT_WRITE and
> CAP_AUDIT_CONTROL unless the process already has it (i.e. it'd require

Seems like CAP_AUDIT_WRITE should be targeted against the
skb->netns->userns. Then CAP_AUDIT_WRITE can be treated like any other
capability. Last I knew (long time ago) you had to be in init_user_ns
to talk audit, but that's ok - this would just do the right thing in
any case.

-serge
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH RFC] audit: provide namespace information in user originated records
2013-03-20 18:36 ` Serge Hallyn
@ 2013-03-20 18:42 ` Eric Paris
2013-03-20 18:49 ` Serge Hallyn
0 siblings, 1 reply; 24+ messages in thread
From: Eric Paris @ 2013-03-20 18:42 UTC (permalink / raw)
To: Serge Hallyn
Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric W. Biederman

On Wed, 2013-03-20 at 13:36 -0500, Serge Hallyn wrote:
> Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> > This is a bit fuzzy to me, perhaps because I'm not fully understanding
> > userns implementation yet, so bear with me:
> > I thought of changing so userns would not grant CAP_AUDIT_WRITE and
> > CAP_AUDIT_CONTROL unless the process already has it (i.e. it'd require
>
> Seems like CAP_AUDIT_WRITE should be targeted against the
> skb->netns->userns. Then CAP_AUDIT_WRITE can be treated like any other
> capability. Last I knew (long time ago) you had to be in init_user_ns
> to talk audit, but that's ok - this would just do the right thing in
> any case.

kauditd should be considered as existing in the init user namespace. So
I'd think we'd want to check if the process had CAP_AUDIT_WRITE in the
init user namespace and if so, allow it to send messages. Who cares what
*ns the process exists in. If it has it in the init namespace, go
ahead. Thus the process that created the container would need
CAP_AUDIT_WRITE in the init namespace for this to all work, right?

/me also gets so confused about what caps mean in the userns world.

(/me has larger issues with the ns concept as a whole, but that boat
sailed years and years ago)
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH RFC] audit: provide namespace information in user originated records
2013-03-20 18:42 ` Eric Paris
@ 2013-03-20 18:49 ` Serge Hallyn
2013-03-20 19:01 ` Eric Paris
0 siblings, 1 reply; 24+ messages in thread
From: Serge Hallyn @ 2013-03-20 18:49 UTC (permalink / raw)
To: Eric Paris
Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric W. Biederman

Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> On Wed, 2013-03-20 at 13:36 -0500, Serge Hallyn wrote:
> > Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> > > This is a bit fuzzy to me, perhaps because I'm not fully understanding
> > > userns implementation yet, so bear with me:
> > > I thought of changing so userns would not grant CAP_AUDIT_WRITE and
> > > CAP_AUDIT_CONTROL unless the process already has it (i.e. it'd require
> >
> > Seems like CAP_AUDIT_WRITE should be targeted against the
> > skb->netns->userns. Then CAP_AUDIT_WRITE can be treated like any other
> > capability. Last I knew (long time ago) you had to be in init_user_ns
> > to talk audit, but that's ok - this would just do the right thing in
> > any case.
>
> kauditd should be considered as existing in the init user namespace. So
> I'd think we'd want to check if the process had CAP_AUDIT_WRITE in the
> init user namespace and if so, allow it to send messages. Who cares what
> *ns the process exists in. If it has it in the init namespace, go
> ahead. Thus the process that created the container would need
> CAP_AUDIT_WRITE in the init namespace for this to all work, right?

Yes. What I was suggesting is intended to work if that situation ever
changes. But I have zero complaints about doing it as you say, as I
doubt it ever will/ought to change.

That basically means CAP_AUDIT_WRITE would be worthless in a non-init
userns. That's fine - at least the rules would be consistent.

> /me also gets so confused about what caps mean in the userns world.

If the resource in question (like a network interface) belongs to a
namespace (netns) created by the userns in which the caller has the caps
in question, then privilege is granted. Otherwise, not.

What you're saying above about CAP_AUDIT_WRITE is exactly right (for how
audit works right now).

> (/me has larger issues with the ns concept as a whole, but that boat
> sailed years and years ago)

-serge
^ permalink raw reply [flat|nested] 24+ messages in thread
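[Editorial illustration] The two CAP_AUDIT_WRITE checks discussed above (against the init user namespace, or against the user namespace that owns the socket's network namespace) can be compared with a small user-space model. Everything here is hypothetical and deliberately simplified: the classes and function names are invented for the example, and a task is assumed to hold capabilities only in its own user namespace, which ignores the kernel's rules about ancestor namespaces.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(eq=False)
class UserNS:
    name: str
    parent: Optional["UserNS"] = None

@dataclass(eq=False)
class NetNS:
    owner: UserNS  # the user namespace from which this netns was created

@dataclass(eq=False)
class Task:
    user_ns: UserNS
    caps: frozenset = frozenset()

def has_cap_in(task, target, cap):
    # Simplification: caps count only in the task's own user namespace.
    return task.user_ns is target and cap in task.caps

def may_write_init_only(task, init_user_ns):
    # Check against the init user namespace, wherever the task lives.
    return has_cap_in(task, init_user_ns, "CAP_AUDIT_WRITE")

def may_write_targeted(task, sock_netns):
    # Check against the user namespace owning the socket's netns.
    return has_cap_in(task, sock_netns.owner, "CAP_AUDIT_WRITE")
```

While the audit socket exists only in the initial network namespace, which the init user namespace owns, both checks give the same answer in this model. That matches the remark that CAP_AUDIT_WRITE would be worthless in a non-init userns.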
* Re: [PATCH RFC] audit: provide namespace information in user originated records
2013-03-20 18:49 ` Serge Hallyn
@ 2013-03-20 19:01 ` Eric Paris
2013-03-20 19:17 ` Aristeu Rozanski
` (2 more replies)
0 siblings, 3 replies; 24+ messages in thread
From: Eric Paris @ 2013-03-20 19:01 UTC (permalink / raw)
To: Serge Hallyn
Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric W. Biederman

On Wed, 2013-03-20 at 13:49 -0500, Serge Hallyn wrote:
> Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> > On Wed, 2013-03-20 at 13:36 -0500, Serge Hallyn wrote:
> > > Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> > > > This is a bit fuzzy to me, perhaps because I'm not fully understanding
> > > > userns implementation yet, so bear with me:
> > > > I thought of changing so userns would not grant CAP_AUDIT_WRITE and
> > > > CAP_AUDIT_CONTROL unless the process already has it (i.e. it'd require
> > >
> > > Seems like CAP_AUDIT_WRITE should be targeted against the
> > > skb->netns->userns. Then CAP_AUDIT_WRITE can be treated like any other
> > > capability. Last I knew (long time ago) you had to be in init_user_ns
> > > to talk audit, but that's ok - this would just do the right thing in
> > > any case.
> >
> > kauditd should be considered as existing in the init user namespace. So
> > I'd think we'd want to check if the process had CAP_AUDIT_WRITE in the
> > init user namespace and if so, allow it to send messages. Who cares what
> > *ns the process exists in. If it has it in the init namespace, go
> > ahead. Thus the process that created the container would need
> > CAP_AUDIT_WRITE in the init namespace for this to all work, right?
>
> Yes. What I was suggesting is intended to work if that situation ever
> changes. But I have zero complaints about doing it as you say, as I
> doubt it ever will/ought to change.
>
> That basically means CAP_AUDIT_WRITE would be worthless in a non-init
> userns. That's fine - at least the rules would be consistent.

[veering away from this particular patch]

We are also talking about adding a CAP_AUDIT_READ and sending messages
via multicast on the audit socket. The problem is I don't know how the
audit socket could work in the network namespace world. Right now
kauditd has:

    audit_sock = netlink_kernel_create(&init_net, NETLINK_AUDIT, &cfg);

So there won't ever be anything on the kernel side of the audit socket
in a non-init network namespace. Let's say that is fixed somehow (I
assume it's possible? something? magic pixies?) I think we'd somehow
need to do the CAP_AUDIT_READ check against the user namespace
associated with the network namespace in question? But what messages
should go to this userspace auditd?

Going to have to have audit namespaces too. But only CAP_AUDIT_READ
would make sense in the new audit namespace...

/me wishes containers were a 'thing' instead of a bucket of semi-related
nuts and bolts.

-Eric
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH RFC] audit: provide namespace information in user originated records
2013-03-20 19:01 ` Eric Paris
@ 2013-03-20 19:17 ` Aristeu Rozanski
2013-03-20 19:19 ` Serge Hallyn
2013-03-20 23:23 ` Eric W. Biederman
2 siblings, 0 replies; 24+ messages in thread
From: Aristeu Rozanski @ 2013-03-20 19:17 UTC (permalink / raw)
To: Eric Paris
Cc: Linux Containers, Serge Hallyn, Eric W. Biederman, linux-audit-H+wXaHxf7aLQT0dZR+AlfA

On Wed, Mar 20, 2013 at 03:01:32PM -0400, Eric Paris wrote:
> [veering away from this particular patch]
>
> We are also talking about adding a CAP_AUDIT_READ and sending messages
> via multicast on the audit socket. The problem is I don't know how the
> audit socket could work in the network namespace world. Right now
> kauditd has:
>
>     audit_sock = netlink_kernel_create(&init_net, NETLINK_AUDIT, &cfg);
>
> So there won't ever be anything on the kernel side of the audit socket
> in a non-init network namespace. Let's say that is fixed somehow (I
> assume it's possible? something? magic pixies?) I think we'd somehow
> need to do the CAP_AUDIT_READ check against the user namespace
> associated with the network namespace in question? But what messages
> should go to this userspace auditd?
>
> Going to have to have audit namespaces too. But only CAP_AUDIT_READ
> would make sense in the new audit namespace...

I guess that could be achieved by forcing the creation of a new network
namespace at the same time you create a new audit namespace. Any new
network namespace created inside this new container would lose
CAP_AUDIT_*.

--
Aristeu
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH RFC] audit: provide namespace information in user originated records
2013-03-20 19:01 ` Eric Paris
2013-03-20 19:17 ` Aristeu Rozanski
@ 2013-03-20 19:19 ` Serge Hallyn
2013-03-20 23:23 ` Eric W. Biederman
2 siblings, 0 replies; 24+ messages in thread
From: Serge Hallyn @ 2013-03-20 19:19 UTC (permalink / raw)
To: Eric Paris
Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric W. Biederman

Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> On Wed, 2013-03-20 at 13:49 -0500, Serge Hallyn wrote:
> > Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> > > On Wed, 2013-03-20 at 13:36 -0500, Serge Hallyn wrote:
> > > > Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> > > > > This is a bit fuzzy to me, perhaps because I'm not fully understanding
> > > > > userns implementation yet, so bear with me:
> > > > > I thought of changing so userns would not grant CAP_AUDIT_WRITE and
> > > > > CAP_AUDIT_CONTROL unless the process already has it (i.e. it'd require
> > > >
> > > > Seems like CAP_AUDIT_WRITE should be targeted against the
> > > > skb->netns->userns. Then CAP_AUDIT_WRITE can be treated like any other
> > > > capability. Last I knew (long time ago) you had to be in init_user_ns
> > > > to talk audit, but that's ok - this would just do the right thing in
> > > > any case.
> > >
> > > kauditd should be considered as existing in the init user namespace. So
> > > I'd think we'd want to check if the process had CAP_AUDIT_WRITE in the
> > > init user namespace and if so, allow it to send messages. Who cares what
> > > *ns the process exists in. If it has it in the init namespace, go
> > > ahead. Thus the process that created the container would need
> > > CAP_AUDIT_WRITE in the init namespace for this to all work, right?
> >
> > Yes. What I was suggesting is intended to work if that situation ever
> > changes. But I have zero complaints about doing it as you say, as I
> > doubt it ever will/ought to change.
> >
> > That basically means CAP_AUDIT_WRITE would be worthless in a non-init
> > userns. That's fine - at least the rules would be consistent.
>
> [veering away from this particular patch]
>
> We are also talking about adding a CAP_AUDIT_READ and sending messages
> via multicast on the audit socket. The problem is I don't know how the
> audit socket could work in the network namespace world. Right now
> kauditd has:
>
>     audit_sock = netlink_kernel_create(&init_net, NETLINK_AUDIT, &cfg);
>
> So there won't ever be anything on the kernel side of the audit socket
> in a non-init network namespace.

Right.

> Let's say that is fixed somehow (I
> assume it's possible? something? magic pixies?) I think we'd somehow
> need to do the CAP_AUDIT_READ check against the user namespace
> associated with the network namespace in question? But what messages
> should go to this userspace auditd?

Ones which pertain to resources in that userns.

If we ever were to sprinkle that pixie dust, then we'd know how to
do this as well :)

> Going to have to have audit namespaces too. But only CAP_AUDIT_READ
> would make sense in the new audit namespace...

It's not clear to me that an audit namespace is needed. The userns
'owns' other namespaces, so it seems like it should suffice for
directing audit msgs.

> /me wishes containers were a 'thing' instead of a bucket of semi-related
> nuts and bolts.

That sure would simplify things. However there definitely are heavy users
of individual namespaces - i.e. using thousands of network namespaces but
no other namespaces.

-serge
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH RFC] audit: provide namespace information in user originated records
2013-03-20 19:01 ` Eric Paris
2013-03-20 19:17 ` Aristeu Rozanski
2013-03-20 19:19 ` Serge Hallyn
@ 2013-03-20 23:23 ` Eric W. Biederman
[not found] ` <87y5dh8xl7.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2 siblings, 1 reply; 24+ messages in thread
From: Eric W. Biederman @ 2013-03-20 23:23 UTC (permalink / raw)
To: Eric Paris
Cc: Linux Containers, Serge Hallyn, linux-audit-H+wXaHxf7aLQT0dZR+AlfA

Eric Paris <eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:

> On Wed, 2013-03-20 at 13:49 -0500, Serge Hallyn wrote:
>> Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
>> > On Wed, 2013-03-20 at 13:36 -0500, Serge Hallyn wrote:
>> > > Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
>> > > > This is a bit fuzzy to me, perhaps because I'm not fully understanding
>> > > > userns implementation yet, so bear with me:
>> > > > I thought of changing so userns would not grant CAP_AUDIT_WRITE and
>> > > > CAP_AUDIT_CONTROL unless the process already has it (i.e. it'd require
>> > >
>> > > Seems like CAP_AUDIT_WRITE should be targeted against the
>> > > skb->netns->userns. Then CAP_AUDIT_WRITE can be treated like any other
>> > > capability. Last I knew (long time ago) you had to be in init_user_ns
>> > > to talk audit, but that's ok - this would just do the right thing in
>> > > any case.
>> >
>> > kauditd should be considered as existing in the init user namespace. So
>> > I'd think we'd want to check if the process had CAP_AUDIT_WRITE in the
>> > init user namespace and if so, allow it to send messages. Who cares what
>> > *ns the process exists in. If it has it in the init namespace, go
>> > ahead. Thus the process that created the container would need
>> > CAP_AUDIT_WRITE in the init namespace for this to all work, right?
>>
>> Yes. What I was suggesting is intended to work if that situation ever
>> changes. But I have zero complaints about doing it as you say, as I
>> doubt it ever will/ought to change.
>>
>> That basically means CAP_AUDIT_WRITE would be worthless in a non-init
>> userns. That's fine - at least the rules would be consistent.
>
> [veering away from this particular patch]
>
> We are also talking about adding a CAP_AUDIT_READ and sending messages
> via multicast on the audit socket. The problem is I don't know how the
> audit socket could work in the network namespace world.

Hmm. I don't quite know how CAP_AUDIT_READ could work. When delivering
a message to a socket you really don't know who is on the other end.

> Right now kauditd has:
>
>     audit_sock = netlink_kernel_create(&init_net, NETLINK_AUDIT, &cfg);
>
> So there won't ever be anything on the kernel side of the audit socket
> in a non-init network namespace. Let's say that is fixed somehow (I
> assume it's possible? something? magic pixies?)

One socket for each network namespace... It is a pain but doable.

> I think we'd somehow
> need to do the CAP_AUDIT_READ check against the user namespace
> associated with the network namespace in question? But what messages
> should go to this userspace auditd?

Messages generated by processes in that user namespace?

> Going to have to have audit namespaces too. But only CAP_AUDIT_READ
> would make sense in the new audit namespace...

Given the connection of audit and security I think if we add support for
a non-global auditd the user namespace seems to fit. The user namespace
is certainly where all of the security connected bits go.

Architecturally it gets a little tricky as it seems to make sense to
generate audit messages that make sense to the process receiving them,
which would mean actually generating a different audit message for
different receiving contexts.

I find the auditsc code odd. We log file descriptor numbers when a file
is mmaped? What good is something so process-relative to anyone?

On a slightly different tangent. Do we want to update the AUDIT_CAPSET
message to report the user namespace whose caps we are changing or
perhaps to suppress the message outside of the initial user namespace?

Eric
^ permalink raw reply [flat|nested] 24+ messages in thread
[parent not found: <87y5dh8xl7.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH RFC] audit: provide namespace information in user originated records
[not found] ` <87y5dh8xl7.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
@ 2013-03-21 1:46 ` Eric Paris
2013-03-21 2:21 ` Serge Hallyn
0 siblings, 1 reply; 24+ messages in thread
From: Eric Paris @ 2013-03-21 1:46 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linux Containers, Serge Hallyn, linux-audit-H+wXaHxf7aLQT0dZR+AlfA

On Wed, 2013-03-20 at 16:23 -0700, Eric W. Biederman wrote:
> Eric Paris <eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:
>
> > On Wed, 2013-03-20 at 13:49 -0500, Serge Hallyn wrote:
> >> Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> >> > On Wed, 2013-03-20 at 13:36 -0500, Serge Hallyn wrote:
> >> > > Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> >> > > > This is a bit fuzzy to me, perhaps because I'm not fully understanding
> >> > > > userns implementation yet, so bear with me:
> >> > > > I thought of changing so userns would not grant CAP_AUDIT_WRITE and
> >> > > > CAP_AUDIT_CONTROL unless the process already has it (i.e. it'd require
> >> > >
> >> > > Seems like CAP_AUDIT_WRITE should be targeted against the
> >> > > skb->netns->userns. Then CAP_AUDIT_WRITE can be treated like any other
> >> > > capability. Last I knew (long time ago) you had to be in init_user_ns
> >> > > to talk audit, but that's ok - this would just do the right thing in
> >> > > any case.
> >> >
> >> > kauditd should be considered as existing in the init user namespace. So
> >> > I'd think we'd want to check if the process had CAP_AUDIT_WRITE in the
> >> > init user namespace and if so, allow it to send messages. Who cares what
> >> > *ns the process exists in. If it has it in the init namespace, go
> >> > ahead. Thus the process that created the container would need
> >> > CAP_AUDIT_WRITE in the init namespace for this to all work, right?
> >>
> >> Yes. What I was suggesting is intended to work if that situation ever
> >> changes. But I have zero complaints about doing it as you say, as I
> >> doubt it ever will/ought to change.
> >>
> >> That basically means CAP_AUDIT_WRITE would be worthless in a non-init
> >> userns. That's fine - at least the rules would be consistent.
> >
> > [veering away from this particular patch]
> >
> > We are also talking about adding a CAP_AUDIT_READ and sending messages
> > via multicast on the audit socket. The problem is I don't know how the
> > audit socket could work in the network namespace world.
>
> Hmm. I don't quite know how CAP_AUDIT_READ could work. When delivering
> a message to a socket you really don't know who is on the other end.
>
> > Right now kauditd has:
> >
> >     audit_sock = netlink_kernel_create(&init_net, NETLINK_AUDIT, &cfg);
> >
> > So there won't ever be anything on the kernel side of the audit socket
> > in a non-init network namespace. Let's say that is fixed somehow (I
> > assume it's possible? something? magic pixies?)
>
> One socket for each network namespace... It is a pain but doable.
>
> > I think we'd somehow
> > need to do the CAP_AUDIT_READ check against the user namespace
> > associated with the network namespace in question? But what messages
> > should go to this userspace auditd?
>
> Messages generated by processes in that user namespace?

So the kernel socket(s) would be per network namespace, but we divide
messages per user namespace? Which socket do I send them on,
considering the possible crazy many<->many mappings between user and
network namespaces. It all makes me cry a little.

> > Going to have to have audit namespaces too. But only CAP_AUDIT_READ
> > would make sense in the new audit namespace...
>
> Given the connection of audit and security I think if we add support for
> a non-global auditd the user namespace seems to fit. The user namespace
> is certainly where all of the security connected bits go.
>
> Architecturally it gets a little tricky as it seems to make sense to
> generate audit messages that make sense to the process receiving them,
> which would mean actually generating a different audit message for
> different receiving contexts.

Assuming as today, we only have 1 auditd and it is system wide. We just
attach consistent identifiable information (aka proc inode number, which
people already use) to the audit records (this patch only does user
messages, but attaching to all messages needs to be done).

Moving to multiple auditd's starts to get really hard, and we might not
ever pursue it :)

> I find the auditsc code odd. We log file descriptor numbers when a file
> is mmaped? What good is something so process-relative to anyone?

When an earlier record showed that fd being opened? I dunno....

> On a slightly different tangent. Do we want to update the AUDIT_CAPSET
> message to report the user namespace whose caps we are changing or
> perhaps to suppress the message outside of the initial user namespace?

The extension of Aris's patch to syscall audit instead of just userspace
audit would take care of this.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH RFC] audit: provide namespace information in user originated records
2013-03-21 1:46 ` Eric Paris
@ 2013-03-21 2:21 ` Serge Hallyn
2013-03-21 4:48 ` Eric W. Biederman
0 siblings, 1 reply; 24+ messages in thread
From: Serge Hallyn @ 2013-03-21 2:21 UTC (permalink / raw)
To: Eric Paris
Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric W. Biederman

Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
> So the kernel socket(s) would be per network namespace, but we divide
> messages per user namespace? Which socket do I send them on,
> considering the possible crazy many<->many mappings between user and
> network namespaces. It all makes me cry a little.

not many-many - each netns is owned by exactly one userns. The userns
from which the netns was created.

-serge
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH RFC] audit: provide namespace information in user originated records
2013-03-21 2:21 ` Serge Hallyn
@ 2013-03-21 4:48 ` Eric W. Biederman
0 siblings, 0 replies; 24+ messages in thread
From: Eric W. Biederman @ 2013-03-21 4:48 UTC (permalink / raw)
To: Serge Hallyn
Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric Paris

Serge Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> writes:

> Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
>> So the kernel socket(s) would be per network namespace, but we divide
>> messages per user namespace? Which socket do I send them on,
>> considering the possible crazy many<->many mappings between user and
>> network namespaces. It all makes me cry a little.
>
> not many-many - each netns is owned by exactly one userns. The userns
> from which the netns was created.

Doh. I missed this question and I think I misunderstood when Eric Paris
was talking about multicasting audit messages.

If what we are really talking about is sending some audit messages to
an auditd in a container, what appears obvious to me is that we define a
per user namespace capability, something like CAP_AUDIT_CONTROL, that
does most or all of what CAP_AUDIT_CONTROL does in the init user
namespace. Especially capturing audit_pid and audit_nlk_portid to
decide who to send the message to.

Something like:

    struct audit_control {
        int initialized;
        pid_t pid;
        u32 nlk_portid;
        struct sock *sock;  /* socket used by the loop below */
    };

    struct user_namespace {
        ...
        struct audit_control audit;
    };

Then the transmission would be something like:

    struct user_namespace *ns = ...;
    for (;;) {
        if (ns->audit.pid) {
            err = netlink_unicast(ns->audit.sock, skb,
                                  ns->audit.nlk_portid, 0);
        }
        if (!ns->parent)
            break;
        ns = ns->parent;
    }

If someone finds auditd interesting enough to do that work.

In general I think it only makes sense if we can reuse the existing
userspace auditd.

Eric
^ permalink raw reply [flat|nested] 24+ messages in thread
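[Editorial illustration] The delivery loop sketched in the message above walks from the originating user namespace up through its parents and hands the record to every auditd registered along the way. A user-space model of that walk, with hypothetical class names and a `send` callback standing in for the netlink delivery, could look like this. It is a sketch of the idea only, not kernel code.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass(eq=False)
class AuditControl:
    pid: int = 0          # 0 means no auditd is registered here
    nlk_portid: int = 0

@dataclass(eq=False)
class UserNamespace:
    name: str
    parent: Optional["UserNamespace"] = None
    audit: AuditControl = field(default_factory=AuditControl)

def deliver(ns, message, send):
    """Offer message to each registered auditd from ns up to the root.

    send(portid, message) stands in for the per-namespace unicast.
    Returns the names of the namespaces that received the message.
    """
    delivered = []
    while ns is not None:
        if ns.audit.pid:
            send(ns.audit.nlk_portid, message)
            delivered.append(ns.name)
        ns = ns.parent
    return delivered
```

In this model a record raised in a nested namespace reaches the container's own daemon first and the host daemon last, and namespaces without a registered daemon are skipped.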
end of thread, other threads:[~2013-03-21 4:48 UTC | newest]
Thread overview: 24+ messages
2013-03-18 15:45 [PATCH RFC] audit: provide namespace information in user originated records Aristeu Rozanski
2013-03-18 15:45 ` [PATCH RFC 1/8] mntns: introduce mntns_get_inum() Aristeu Rozanski
2013-03-18 15:45 ` [PATCH RFC 2/8] ipcns: introduce ipcns_get_inum() Aristeu Rozanski
2013-03-18 15:45 ` [PATCH RFC 3/8] pidns: introduce pidns_get_inum() Aristeu Rozanski
2013-03-18 15:45 ` [PATCH RFC 4/8] userns: introduce userns_get_inum() Aristeu Rozanski
2013-03-18 15:45 ` [PATCH RFC 5/8] utsns: introduce utsns_get_inum() Aristeu Rozanski
2013-03-18 15:45 ` [PATCH RFC 6/8] netns: introduce netns_get_inum() Aristeu Rozanski
2013-03-18 15:45 ` [PATCH RFC 7/8] audit: report namespace information along with USER events Aristeu Rozanski
2013-03-18 15:45 ` [PATCH RFC 8/8] audit: allow user records to be created inside a container Aristeu Rozanski
[not found] <1363619405-6419-1-git-send-email-arozansk@redhat.com>
[not found] ` <1363619405-6419-1-git-send-email-arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-18 22:16 ` [PATCH RFC] audit: provide namespace information in user originated records Eric W. Biederman
[not found] ` <877gl48iaz.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-03-19 12:24 ` Aristeu Rozanski
[not found] ` <20130319122408.GC20187-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-20 0:00 ` Eric W. Biederman
[not found] ` <874ng7gcst.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-03-20 15:12 ` Serge Hallyn
2013-03-20 15:45 ` Aristeu Rozanski
[not found] ` <20130320154503.GF20187-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-20 18:36 ` Serge Hallyn
2013-03-20 18:42 ` Eric Paris
2013-03-20 18:49 ` Serge Hallyn
2013-03-20 19:01 ` Eric Paris
2013-03-20 19:17 ` Aristeu Rozanski
2013-03-20 19:19 ` Serge Hallyn
2013-03-20 23:23 ` Eric W. Biederman
[not found] ` <87y5dh8xl7.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-03-21 1:46 ` Eric Paris
2013-03-21 2:21 ` Serge Hallyn
2013-03-21 4:48 ` Eric W. Biederman