* Re: [PATCH RFC 8/8] audit: allow user records to be created inside a container [not found] ` <1363619405-6419-9-git-send-email-arozansk@redhat.com> @ 2013-03-18 21:28 ` Eric W. Biederman 0 siblings, 0 replies; 20+ messages in thread From: Eric W. Biederman @ 2013-03-18 21:28 UTC (permalink / raw) To: Aristeu Rozanski; +Cc: linux-audit Aristeu Rozanski <arozansk@redhat.com> writes: > Since user events will be followed by namespace information, userspace > can filter off undesired container records. I don't think we want to allow any user to write to the audit records, that is what nsown_capable will allow, as all you would need to do is to unshare the user namespace to be able to write audit records. Eric > @@ -597,13 +612,13 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type) > case AUDIT_TTY_SET: > case AUDIT_TRIM: > case AUDIT_MAKE_EQUIV: > - if (!capable(CAP_AUDIT_CONTROL)) > + if (!nsown_capable(CAP_AUDIT_CONTROL)) > err = -EPERM; > break; > case AUDIT_USER: > case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG: > case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2: > - if (!capable(CAP_AUDIT_WRITE)) > + if (!nsown_capable(CAP_AUDIT_WRITE)) > err = -EPERM; > break; > default: /* bad msg */ ^ permalink raw reply [flat|nested] 20+ messages in thread
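Editorial sketch of the objection above. This is a toy userspace model, not kernel code, and every name in it (toy_userns, toy_cred, toy_ns_capable and friends) is hypothetical. It assumes that capable() tests a capability against the initial user namespace, and that nsown_capable() tests it against the caller's own user namespace, where a task holds a full capability set right after it unshares.

```c
/* Toy model only: hypothetical names, simplified to a single capability. */
#include <stdbool.h>
#include <stddef.h>

struct toy_userns {
    const struct toy_userns *parent; /* NULL for the initial namespace */
    unsigned int owner_uid;          /* global uid of the creator */
};

struct toy_cred {
    const struct toy_userns *user_ns; /* namespace the task lives in */
    unsigned int euid;                /* global effective uid */
    bool cap_audit_write;             /* CAP_AUDIT_WRITE held in user_ns */
};

const struct toy_userns toy_init_ns = { NULL, 0 };

/* Walk from the target namespace toward the initial one. */
bool toy_ns_capable(const struct toy_cred *cred,
                    const struct toy_userns *target)
{
    for (const struct toy_userns *ns = target; ns != NULL; ns = ns->parent) {
        /* The capability only counts in the namespace the task lives in. */
        if (ns == cred->user_ns)
            return cred->cap_audit_write;
        /* The creator of a child namespace holds all caps over it. */
        if (ns->parent == cred->user_ns && ns->owner_uid == cred->euid)
            return true;
    }
    return false;
}

/* Assumed meaning of capable(): test against the initial namespace. */
bool toy_capable(const struct toy_cred *cred)
{
    return toy_ns_capable(cred, &toy_init_ns);
}

/* Assumed meaning of nsown_capable(): test against the caller's own ns. */
bool toy_nsown_capable(const struct toy_cred *cred)
{
    return toy_ns_capable(cred, cred->user_ns);
}

/* Model of unsharing the user namespace: full caps inside the new ns. */
struct toy_cred toy_unshare_userns(const struct toy_cred *cred,
                                   struct toy_userns *new_ns)
{
    struct toy_cred inside = *cred;

    new_ns->parent = cred->user_ns;
    new_ns->owner_uid = cred->euid;
    inside.user_ns = new_ns;
    inside.cap_audit_write = true;
    return inside;
}
```

In this model an unprivileged task fails both checks before unsharing. After toy_unshare_userns() it passes toy_nsown_capable() while still failing toy_capable(), which is the gap the reply points at.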
[parent not found: <1363619405-6419-8-git-send-email-arozansk@redhat.com>]
* Re: [PATCH RFC 7/8] audit: report namespace information along with USER events [not found] ` <1363619405-6419-8-git-send-email-arozansk@redhat.com> @ 2013-03-18 21:44 ` Eric W. Biederman 2013-03-19 12:08 ` Aristeu Rozanski [not found] ` <871ubc9yda.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> 0 siblings, 2 replies; 20+ messages in thread From: Eric W. Biederman @ 2013-03-18 21:44 UTC (permalink / raw) To: Aristeu Rozanski; +Cc: linux-audit Aristeu Rozanski <arozansk@redhat.com> writes: > For userspace generated events, include a record with the namespace > procfs inode numbers the process belongs to. This allows to track down > and filter audit messages by userspace. I am not comfortable with using the inode numbers this way. It does not pass the test of can I migrate a container and still have this work test. Any kind of kernel assigned name for namespaces fails that test. I also don't like that you don't include the procfs device number. An inode number means nothing without knowing which filesystem you are referring to. It may never happen but I reserve the right to have the inode numbers for namespaces to show up differently in different instances of procfs. Beyond that I think this usage is possibly buggy by using two audit records for one event. 
> Signed-off-by: Aristeu Rozanski <arozansk@redhat.com> > --- > include/uapi/linux/audit.h | 1 + > kernel/audit.c | 51 +++++++++++++++++++++++++++++++++++++++++++- > 2 files changed, 51 insertions(+), 1 deletions(-) > > diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h > index 9f096f1..3ec3ccb 100644 > --- a/include/uapi/linux/audit.h > +++ b/include/uapi/linux/audit.h > @@ -106,6 +106,7 @@ > #define AUDIT_NETFILTER_PKT 1324 /* Packets traversing netfilter chains */ > #define AUDIT_NETFILTER_CFG 1325 /* Netfilter chain modifications */ > #define AUDIT_SECCOMP 1326 /* Secure Computing event */ > +#define AUDIT_USER_NAMESPACE 1327 /* Information about process' namespaces */ > > #define AUDIT_AVC 1400 /* SE Linux avc denial or grant */ > #define AUDIT_SELINUX_ERR 1401 /* Internal SE Linux Errors */ > diff --git a/kernel/audit.c b/kernel/audit.c > index 58db117..b17f9c0 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -62,6 +62,11 @@ > #include <linux/freezer.h> > #include <linux/tty.h> > #include <linux/pid_namespace.h> > +#include <linux/ipc_namespace.h> > +#include <linux/mnt_namespace.h> > +#include <linux/utsname.h> > +#include <linux/user_namespace.h> > +#include <net/net_namespace.h> > > #include "audit.h" > > @@ -641,6 +646,49 @@ static int audit_log_common_recv_msg(struct audit_buffer **ab, u16 msg_type, > return rc; > } > > +#ifdef CONFIG_NAMESPACES > +static int audit_log_namespaces(struct task_struct *tsk, > + struct sk_buff *skb) > +{ > + struct audit_context *ctx = tsk->audit_context; > + struct audit_buffer *ab; > + > + if (!audit_enabled) > + return 0; > + > + ab = audit_log_start(ctx, GFP_KERNEL, AUDIT_USER_NAMESPACE); > + if (unlikely(!ab)) > + return -ENOMEM; > + > + audit_log_format(ab, "mnt=%u", mntns_get_inum(tsk)); > +#ifdef CONFIG_NET_NS > + audit_log_format(ab, " net=%u", netns_get_inum(tsk)); > +#endif > +#ifdef CONFIG_UTS_NS > + audit_log_format(ab, " uts=%u", utsns_get_inum(tsk)); > +#endif > +#ifdef CONFIG_IPC_NS 
> + audit_log_format(ab, " ipc=%u", ipcns_get_inum(tsk)); > +#endif > +#ifdef CONFIG_PID_NS > + audit_log_format(ab, " pid=%u", pidns_get_inum(tsk)); > +#endif > +#ifdef CONFIG_USER_NS > + audit_log_format(ab, " user=%u", userns_get_inum(tsk)); > +#endif > + audit_set_pid(ab, NETLINK_CB(skb).portid); > + audit_log_end(ab); > + > + return 0; > +} > +#else > +static inline int audit_log_namespaces(struct task_struct *tsk, > + struct sk_buff *skb) > +{ > + return 0; > +} > +#endif > + > static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) > { > u32 seq, sid; > @@ -741,7 +789,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) > } > audit_log_common_recv_msg(&ab, msg_type, > loginuid, sessionid, sid, > - NULL); > + current->audit_context); > > if (msg_type != AUDIT_USER_TTY) > audit_log_format(ab, " msg='%.1024s'", > @@ -758,6 +806,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) > } > audit_set_pid(ab, NETLINK_CB(skb).portid); > audit_log_end(ab); > + audit_log_namespaces(current, skb); > } > break; > case AUDIT_ADD: ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC 7/8] audit: report namespace information along with USER events 2013-03-18 21:44 ` [PATCH RFC 7/8] audit: report namespace information along with USER events Eric W. Biederman @ 2013-03-19 12:08 ` Aristeu Rozanski [not found] ` <871ubc9yda.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> 1 sibling, 0 replies; 20+ messages in thread From: Aristeu Rozanski @ 2013-03-19 12:08 UTC (permalink / raw) To: Eric W. Biederman; +Cc: linux-audit On Mon, Mar 18, 2013 at 02:44:33PM -0700, Eric W. Biederman wrote: > Aristeu Rozanski <arozansk@redhat.com> writes: > > > For userspace generated events, include a record with the namespace > > procfs inode numbers the process belongs to. This allows to track down > > and filter audit messages by userspace. > > I am not comfortable with using the inode numbers this way. It does not > pass the test of can I migrate a container and still have this work > test. Any kind of kernel assigned name for namespaces fails that test. > > I also don't like that you don't include the procfs device number. An > inode number means nothing without knowing which filesystem you are > referring to. > > It may never happen but I reserve the right to have the inode numbers > for namespaces to show up differently in different instances of procfs. well, in this case the whole idea is invalid. there's no way to reliably identify which namespaces a process belongs to for logging purposes. > Beyond that I think this usage is possibly buggy by using two audit > records for one event. this is valid, the records are related and they show up with the same timestamp. -- Aristeu ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <871ubc9yda.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH RFC 7/8] audit: report namespace information along with USER events [not found] ` <871ubc9yda.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> @ 2014-01-24 6:19 ` Richard Guy Briggs 0 siblings, 0 replies; 20+ messages in thread From: Richard Guy Briggs @ 2014-01-24 6:19 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA On 13/03/18, Eric W. Biederman wrote: > Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes: (Digging up an old thread...) > > For userspace generated events, include a record with the namespace > > procfs inode numbers the process belongs to. This allows to track down > > and filter audit messages by userspace. > > I am not comfortable with using the inode numbers this way. It does not > pass the test of can I migrate a container and still have this work > test. Any kind of kernel assigned name for namespaces fails that test. Any kind? How about if we have a systemwide atomically incremented serial number assigned every time a namespace is created? This is close to what the inode number was except the inode could be in a different proc device, as pointed out. > I also don't like that you don't include the procfs device number. An > inode number means nothing without knowing which filesystem you are > referring to. I'm looking at having everything relative to init_*_ns to start with, so this isn't a problem initially, but may become so if it isn't the case. Can anyone point out off-hand how to find that proc device number? (I'll start looking...) > It may never happen but I reserve the right to have the inode numbers > for namespaces to show up differently in different instances of procfs. So would that serial number idea work better? > Beyond that I think this usage is possibly buggy by using two audit > records for one event. I'm looking at integrating this information into a standard message. 
- RGB -- Richard Guy Briggs <rbriggs-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545 ^ permalink raw reply [flat|nested] 20+ messages in thread
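Editorial sketch of the serial-number idea raised in this reply: a single system-wide counter, incremented atomically each time a namespace is created. It is a userspace illustration using C11 atomics. The names toy_ns, toy_ns_serial_next and toy_ns_init are hypothetical and nothing here is taken from an actual patch. It does not answer the migration question raised earlier in the thread, since the number is still assigned by one kernel instance.

```c
/* Toy model only: hypothetical names, userspace C11 atomics. */
#include <stdatomic.h>
#include <stdint.h>

/* One counter shared by every namespace type; 0 is kept as "unset". */
static _Atomic uint64_t toy_ns_serial_counter = 1;

struct toy_ns {
    uint64_t serial; /* assigned once at creation, never reused */
};

/* Hand out the next serial number; safe to call from several threads. */
uint64_t toy_ns_serial_next(void)
{
    return atomic_fetch_add_explicit(&toy_ns_serial_counter, 1,
                                     memory_order_relaxed);
}

/* Called wherever a namespace of any kind would be created. */
void toy_ns_init(struct toy_ns *ns)
{
    ns->serial = toy_ns_serial_next();
}
```

Relaxed ordering is enough here because only the uniqueness of the value matters, not its ordering relative to other memory operations.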
[parent not found: <1363619405-6419-1-git-send-email-arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH RFC] audit: provide namespace information in user originated records [not found] ` <1363619405-6419-1-git-send-email-arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2013-03-18 22:16 ` Eric W. Biederman [not found] ` <877gl48iaz.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> 0 siblings, 1 reply; 20+ messages in thread From: Eric W. Biederman @ 2013-03-18 22:16 UTC (permalink / raw) To: Aristeu Rozanski Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric Paris Adding the containers list so folks with container expertise can see what is being proposed. Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes: > This patchset introduces a new audit record to follow all USER records which > provides namespace information of the process. The idea is to allow processes > in containers to create records in the host system while providing means to be > filtered out. It looks like this mechanism makes it easy for an unprivileged program to spam and overwhelm the audit log. > For each new namespace, a unique procfs inode number is allocated and this > number has been used by userspace to determine which processes belong to the > same namespace. These numbers are used in the new audit record. > > Applications such as libvirt-sandbox and lxc can then report the same numbers > when a container is created and destroyed allowing to map records to a certain > container. Maybe the next step would be having a record for whenever a new > namespace is created? > > First 6 patches are needed in order to get each namespace's inode number. Grumble the existing methods can be used you don't have to introduce a whole new set of methods. Grumble. Besides the bug of assuming that the inodes now and forever will be the same across all instances of proc. > Patch 7 properly defines the new record that is related to the USER > record Not agmenting the current user records seems a little odd to me. 
You also continue my current policy of not allowing any audit records in the container itself, so I don't quite know what the point of all of this is. > Patch 8 allows USER records to be generated from different namespaces Which essentially allows any user to create any USER record they want whenever they want. > Here's an example of output: > type=CRED_DISP msg=audit(1363528861.403:311): pid=20016 uid=0 auid=0 ses=45 subj=system_u:system_r:crond_t:s0-s0:c0.c1023 msg='op=PAM:setcred acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success' Ok. This seems totally bizarre. You are running a container with a user namespace with some uid mapped to uid 0? That defeats about half the point of having user namespaces, as half the files in the world are owned by uid 0, and can be written by uid 0 outside of your user namespace. Hmm. I need to look at this in a little more detail but I believe our use of task_pid_vnr here in the audit record is a long-standing bug. > type=UNKNOWN[1327] msg=audit(1363528861.403:311): mnt=4026531840 net=4026531956 uts=4026531838 ipc=4026531839 pid=4026531836 user=4026531837 > > Notes: > - this is a RFC, all sorts of feedback are much appreciated > - while the last patch allows a new userns to send audit records, I haven't > look yet on making sure it has proper capabilities so regular users' > containers can create records I don't think it does. > - the record number allocated is just a draft. 
If this patchset evolves into > something that can be merged, please advise which number number is the best > choice > - I'm not subscribed to the list, so please make sure I'm on the Cc list > > fs/namespace.c | 14 +++++++ > include/linux/ipc_namespace.h | 1 > include/linux/mnt_namespace.h | 2 + > include/linux/pid_namespace.h | 1 > include/linux/user_namespace.h | 1 > include/linux/utsname.h | 1 > include/net/net_namespace.h | 1 > include/uapi/linux/audit.h | 1 > ipc/namespace.c | 14 +++++++ > kernel/audit.c | 76 +++++++++++++++++++++++++++++++++++++---- > kernel/pid_namespace.c | 11 +++++ > kernel/user_namespace.c | 5 ++ > kernel/utsname.c | 14 +++++++ > net/core/net_namespace.c | 14 +++++++ > 14 files changed, 150 insertions(+), 6 deletions(-) ^ permalink raw reply [flat|nested] 20+ messages in thread
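Editorial sketch of how userspace might read the proposed namespace record quoted in this message in order to filter on it. The helper name toy_record_field is hypothetical. The only thing assumed about the record is the space-separated key=value layout shown in the example output. Real audit tooling has its own parsers, and this is not one of them.

```c
/* Toy parser only: hypothetical helper, assumes "key=value" tokens. */
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Find "key=<decimal>" as a whole token in record and store the number.
 * Returns 0 on success, -1 if the key is absent or not numeric.
 */
int toy_record_field(const char *record, const char *key,
                     unsigned long long *out)
{
    size_t klen = strlen(key);
    const char *p = record;

    if (klen == 0)
        return -1;

    while ((p = strstr(p, key)) != NULL) {
        /* Accept the match only at the start of a token. */
        int at_token_start = (p == record) || (p[-1] == ' ');

        if (at_token_start && p[klen] == '=') {
            const char *digits = p + klen + 1;
            char *end;
            unsigned long long v;

            errno = 0;
            v = strtoull(digits, &end, 10);
            if (end == digits || errno == ERANGE)
                return -1;
            *out = v;
            return 0;
        }
        p += 1; /* keep scanning past this partial match */
    }
    return -1;
}
```

A filter could compare the net or pid value from a record with the values it recorded for the host's own namespaces and drop everything else.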
[parent not found: <877gl48iaz.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH RFC] audit: provide namespace information in user originated records [not found] ` <877gl48iaz.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> @ 2013-03-19 12:24 ` Aristeu Rozanski [not found] ` <20130319122408.GC20187-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 20+ messages in thread From: Aristeu Rozanski @ 2013-03-19 12:24 UTC (permalink / raw) To: Eric W. Biederman Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric Paris On Mon, Mar 18, 2013 at 03:16:52PM -0700, Eric W. Biederman wrote: > Adding the containers list so folks with container expertise can see > what is being proposed. > > Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes: > > > This patchset introduces a new audit record to follow all USER records which > > provides namespace information of the process. The idea is to allow processes > > in containers to create records in the host system while providing means to be > > filtered out. > > It looks like this mechanism makes it easy for an unprivileged program > to spam and overwhelm the audit log. > > > For each new namespace, a unique procfs inode number is allocated and this > > number has been used by userspace to determine which processes belong to the > > same namespace. These numbers are used in the new audit record. > > > > Applications such as libvirt-sandbox and lxc can then report the same numbers > > when a container is created and destroyed allowing to map records to a certain > > container. Maybe the next step would be having a record for whenever a new > > namespace is created? > > > > First 6 patches are needed in order to get each namespace's inode number. > > Grumble the existing methods can be used you don't have to introduce a > whole new set of methods. Grumble. Besides the bug of assuming that > the inodes now and forever will be the same across all instances of > proc. the existing methods are for procfs use and I didn't want to abuse it. 
like I said the other email, the fact that it's not a reliable way to indefinitely describe a namespace due to multiple procfs instances or migration, the whole idea is flawed. > > Patch 7 properly defines the new record that is related to the USER > > record > > Not agmenting the current user records seems a little odd to me. > > You also continue in this my current policy of not allowing any audit > records in the container itself, so I a don't quite know what the point > of all of this is. your current policy wasn't known to me and /* Only support the initial namespaces for now. */ sounds like something that didn't happen for other reasons > > Patch 8 allows USER records to be generated from different namespaces > > Which essentially allows any user to create any USER record they want > whenever they want. > > > Here's an example of output: > > type=CRED_DISP msg=audit(1363528861.403:311): pid=20016 uid=0 auid=0 ses=45 subj=system_u:system_r:crond_t:s0-s0:c0.c1023 msg='op=PAM:setcred acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success' > > Ok. This seems totally bizarre. You are running a container with a > user namespace with some uid mapped to uid 0? on the notes section: - while the last patch allows a new userns to send audit records, I haven't look yet on making sure it has proper capabilities so regular users' containers can create records so I haven't tried it with userns. It's a RFC. That's a regular record to show the related records, using initial namespaces. like I stated in the email, I wasn't sure how I'd handle capabilities but the idea would be to allow containers to log to the system's auditd. since inode numbers aren't more reliable for more than a moment, I guess there's no other way than having an audit namespace and run an audit daemon inside the container (and communicate over the network like an individual host). -- Aristeu ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <20130319122408.GC20187-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH RFC] audit: provide namespace information in user originated records [not found] ` <20130319122408.GC20187-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2013-03-20 0:00 ` Eric W. Biederman [not found] ` <874ng7gcst.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> 0 siblings, 1 reply; 20+ messages in thread From: Eric W. Biederman @ 2013-03-20 0:00 UTC (permalink / raw) To: Aristeu Rozanski Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric Paris Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes: > On Mon, Mar 18, 2013 at 03:16:52PM -0700, Eric W. Biederman wrote: >> Adding the containers list so folks with container expertise can see >> what is being proposed. >> >> Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes: >> >> > This patchset introduces a new audit record to follow all USER records which >> > provides namespace information of the process. The idea is to allow processes >> > in containers to create records in the host system while providing means to be >> > filtered out. >> >> It looks like this mechanism makes it easy for an unprivileged program >> to spam and overwhelm the audit log. >> >> > For each new namespace, a unique procfs inode number is allocated and this >> > number has been used by userspace to determine which processes belong to the >> > same namespace. These numbers are used in the new audit record. >> > >> > Applications such as libvirt-sandbox and lxc can then report the same numbers >> > when a container is created and destroyed allowing to map records to a certain >> > container. Maybe the next step would be having a record for whenever a new >> > namespace is created? >> > >> > First 6 patches are needed in order to get each namespace's inode number. >> >> Grumble the existing methods can be used you don't have to introduce a >> whole new set of methods. Grumble. 
Besides the bug of assuming that >> the inodes now and forever will be the same across all instances of >> proc. > > the existing methods are for procfs use and I didn't want to abuse it. > like I said the other email, the fact that it's not a reliable way to > indefinitely describe a namespace due to multiple procfs instances or > migration, the whole idea is flawed. It is always possible to pick the instance of /proc connected to the initial pid namespace. And there is a device number you can use to say that. Usually designs that need global identifiers for namespaces suffer from the need for a namespace of namespaces (which we sort of have in /proc), and I push back by default to get people to think if what they are trying to do really makes sense. >> > Patch 7 properly defines the new record that is related to the USER >> > record >> >> Not agmenting the current user records seems a little odd to me. >> >> You also continue in this my current policy of not allowing any audit >> records in the container itself, so I a don't quite know what the point >> of all of this is. > > your current policy wasn't known to me and > /* Only support the initial namespaces for now. */ > sounds like something that didn't happen for other reasons The reasons were simply that to my knowledge no one has thought through how audit records and namespaces make sense to interact. My expectation would be that an extention of audit records would be logged on a per container basis. But I don't have any motivating examples. >> > Patch 8 allows USER records to be generated from different namespaces >> >> Which essentially allows any user to create any USER record they want >> whenever they want. >> >> > Here's an example of output: >> > type=CRED_DISP msg=audit(1363528861.403:311): pid=20016 uid=0 auid=0 ses=45 subj=system_u:system_r:crond_t:s0-s0:c0.c1023 msg='op=PAM:setcred acct="root" exe="/usr/sbin/crond" hostname=? addr=? terminal=cron res=success' >> >> Ok. This seems totally bizarre. 
You are running a container with a >> user namespace with some uid mapped to uid 0? > > on the notes section: > - while the last patch allows a new userns to send audit records, I haven't > look yet on making sure it has proper capabilities so regular users' > containers can create records > > so I haven't tried it with userns. It's a RFC. I thought you would have taken the time to run it at least once, or perhaps to have manually edited your example to see how things would fit together. > That's a regular record > to show the related records, using initial namespaces. like I stated in > the email, I wasn't sure how I'd handle capabilities but the idea would be > to allow containers to log to the system's auditd. since inode numbers > aren't more reliable for more than a moment, I guess there's no other > way than having an audit namespace and run an audit daemon inside the > container (and communicate over the network like an individual host). What was really missing from your RFC is a motivating example. I sort of see that in your paragraph above but it isn't clear to me. What is lost by not allowing USER audit records from processes in containers? What is gained by letting user processes emit them? And of course, what are your thoughts on preventing unprivileged users from overwhelming the audit subsystem? My minimal experience with the audit subsystem roughly feels like hardly anyone really cares. Although I may be wrong. Eric ^ permalink raw reply [flat|nested] 20+ messages in thread
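Editorial sketch of the point made in this message that an inode number only identifies something together with a device number. From userspace, stat(2) reports both for a /proc/<pid>/ns/<name> entry. The helper names (toy_ns_id, toy_path_id, toy_ns_id_of, toy_ns_id_equal) are hypothetical. The sketch takes no position on which /proc instance should be consulted, which is the open question in the thread.

```c
/* Toy helpers only: hypothetical names built on stat(2). */
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

struct toy_ns_id {
    dev_t dev; /* device the inode lives on */
    ino_t ino; /* inode number, meaningful only together with dev */
};

/* Fill out with the (device, inode) pair of path; -1 on failure. */
int toy_path_id(const char *path, struct toy_ns_id *out)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    out->dev = st.st_dev;
    out->ino = st.st_ino;
    return 0;
}

/* pid is a string such as "self" or "1"; ns_name such as "net" or "pid". */
int toy_ns_id_of(const char *pid, const char *ns_name, struct toy_ns_id *out)
{
    char path[64];
    int n = snprintf(path, sizeof(path), "/proc/%s/ns/%s", pid, ns_name);

    if (n < 0 || (size_t)n >= sizeof(path))
        return -1; /* formatting failed or the path was truncated */
    return toy_path_id(path, out);
}

/* Two identifiers match only if both device and inode match. */
int toy_ns_id_equal(const struct toy_ns_id *a, const struct toy_ns_id *b)
{
    return a->dev == b->dev && a->ino == b->ino;
}
```

For example, calling toy_ns_id_of("self", "net", &mine) and toy_ns_id_of("1", "net", &host) and comparing the results with toy_ns_id_equal() would tell a sufficiently privileged process whether it shares a network namespace with pid 1 as seen through that /proc mount.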
[parent not found: <874ng7gcst.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH RFC] audit: provide namespace information in user originated records [not found] ` <874ng7gcst.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> @ 2013-03-20 15:12 ` Serge Hallyn 2013-03-20 15:45 ` Aristeu Rozanski 1 sibling, 0 replies; 20+ messages in thread From: Serge Hallyn @ 2013-03-20 15:12 UTC (permalink / raw) To: Eric W. Biederman Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric Paris Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org): > Aristeu Rozanski <arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes: > The reasons were simply that to my knowledge no one has thought through > how audit records and namespaces make sense to interact. It seems clear to me (perhaps wrongly :) that: 1. auditd is a host service only. 2. in cases where the namespace is hierarchical and resources have identifiers in the init namespace (i.e. pid and user ns), audit should simply, always, report the id in the init ns 3. in cases where namespaces are not hierarchical (ipc, netns) the (ns_id, resource_id) need to be dumped. The ns_id should be the inode # for the /proc/$$/ns/$namespace, since that is what is used for setns. Syslog I want eventually to be namespaced. Audit, not. Audit is (ISTM) about LSPP and such - things which we can't talk about in containers anyway. -serge ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC] audit: provide namespace information in user originated records [not found] ` <874ng7gcst.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> 2013-03-20 15:12 ` Serge Hallyn @ 2013-03-20 15:45 ` Aristeu Rozanski [not found] ` <20130320154503.GF20187-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 1 sibling, 1 reply; 20+ messages in thread From: Aristeu Rozanski @ 2013-03-20 15:45 UTC (permalink / raw) To: Eric W. Biederman Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric Paris On Tue, Mar 19, 2013 at 05:00:50PM -0700, Eric W. Biederman wrote: > It is always possible to pick the instance of /proc connected to the > initial pid namespace. And there is a device number you can use to say > that. I wasn't aware of that; I'll take a look, thanks! > The reasons were simply that to my knowledge no one has thought through > how audit records and namespaces make sense to interact. > > My expectation would be that an extention of audit records would be > logged on a per container basis. But I don't have any motivating > examples. from what I've heard, there are two possibilities here: if a container is understood to be "light virtualization", it should behave just like another machine by having its own auditd daemon, sending records over the network to the host. If that's not the case, a single auditd must be present. But since you might want to run an sshd server inside a container, it might be desirable to have USER_AUTH records, for example. > I though you would have taken the time to run it at least once, or to > perhaps have manually edited your example to see how things would fit > together. I did run it with different namespaces but not with userns. The example was to show what the extra record would look like, and I randomly picked one. The idea is that auditd will know which namespaces are the original ones and can use that to filter containers' records, which could be filtered out by default. 
> What was really missing from your RFC is a motivating example. I sort > of see that in your paragraph above but it isn't clear to me. > > What is lost by not allowing USER audit records from processes in > containers? What is gained by implementing user process to have them? > And of course what are your thoughts on preventing unprivileged users > overwhelming the audit subsystem. This is a bit fuzzy to me, perhaps because I don't fully understand the userns implementation yet, so bear with me: I thought of changing things so that a userns would not grant CAP_AUDIT_WRITE and CAP_AUDIT_CONTROL unless the process already has them (i.e. it would require being root in the init_ns). The 'init' process would start trusted daemons with those capabilities, then drop the capabilities for everything else. Does that make sense? -- Aristeu ^ permalink raw reply [flat|nested] 20+ messages in thread
[parent not found: <20130320154503.GF20187-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH RFC] audit: provide namespace information in user originated records [not found] ` <20130320154503.GF20187-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2013-03-20 18:36 ` Serge Hallyn 2013-03-20 18:42 ` Eric Paris 0 siblings, 1 reply; 20+ messages in thread From: Serge Hallyn @ 2013-03-20 18:36 UTC (permalink / raw) To: Aristeu Rozanski Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric W. Biederman, Eric Paris Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): > This is a bit fuzzy to me, perhaps due I'm not fully understanding > userns implementation yet, so bear with me: > I thought of changing so userns would not grant CAP_AUDIT_WRITE and > CAP_AUDIT_CONTROL unless the process already has it (i.e. it'd require Seems like CAP_AUDIT_WRITE should be targeted against the skb->netns->userns. Then CAP_AUDIT_WRITE can be treated like any other capability. Last I knew (long time ago) you had to be in init_user_ns to talk audit, but that's ok - this would just do the right thing in any case. -serge ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC] audit: provide namespace information in user originated records 2013-03-20 18:36 ` Serge Hallyn @ 2013-03-20 18:42 ` Eric Paris 2013-03-20 18:49 ` Serge Hallyn 0 siblings, 1 reply; 20+ messages in thread From: Eric Paris @ 2013-03-20 18:42 UTC (permalink / raw) To: Serge Hallyn Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric W. Biederman On Wed, 2013-03-20 at 13:36 -0500, Serge Hallyn wrote: > Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): > > This is a bit fuzzy to me, perhaps due I'm not fully understanding > > userns implementation yet, so bear with me: > > I thought of changing so userns would not grant CAP_AUDIT_WRITE and > > CAP_AUDIT_CONTROL unless the process already has it (i.e. it'd require > > Seems like CAP_AUDIT_WRITE should be targeted against the > skb->netns->userns. Then CAP_AUDIT_WRITE can be treated like any other > capability. Last I knew (long time ago) you had to be in init_user_ns > to talk audit, but that's ok - this would just do the right thing in > any case. kauditd should be considered as existing in the init user namespace. So I'd think we'd want to check if the process had CAP_AUDIT_WRITE in the init user namespace and if so, allow it to send messages. Who cares what *ns the process exists in. If it has it in the init namespace, go ahead. Thus the process that created the container would need CAP_AUDIT_WRITE in the init namespace for this all to work, right? /me also gets so confused about what caps mean in the userns world. (/me has larger issues with the ns concept as a whole, but that boat sailed years and years ago) ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC] audit: provide namespace information in user originated records 2013-03-20 18:42 ` Eric Paris @ 2013-03-20 18:49 ` Serge Hallyn 2013-03-20 19:01 ` Eric Paris 0 siblings, 1 reply; 20+ messages in thread From: Serge Hallyn @ 2013-03-20 18:49 UTC (permalink / raw) To: Eric Paris Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric W. Biederman Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): > On Wed, 2013-03-20 at 13:36 -0500, Serge Hallyn wrote: > > Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): > > > This is a bit fuzzy to me, perhaps due I'm not fully understanding > > > userns implementation yet, so bear with me: > > > I thought of changing so userns would not grant CAP_AUDIT_WRITE and > > > CAP_AUDIT_CONTROL unless the process already has it (i.e. it'd require > > > > Seems like CAP_AUDIT_WRITE should be targeted against the > > skb->netns->userns. Then CAP_AUDIT_WRITE can be treated like any other > > capability. Last I knew (long time ago) you had to be in init_user_ns > > to talk audit, but that's ok - this would just do the right thing in > > any case. > > kauditd should be considered as existing in the init user namespace. So > I'd think we'd want to check if the process had CAP_AUDIT_WRITE in the > init user namespace and if so, allow it to send messages. Who care what > *ns the process exists in. If it has it in the init namespace, go > ahead. Thus the process that created the container would need > CAP_AUDIT_WRITE in the init namespace for this to all work, right? Yes. What I was suggesting is intended to work if that situation ever changes. But I have zero complaints about doing it as you say, as I doubt it ever will/ought to change. That basically means CAP_AUDIT_WRITE would be worthless in a non-init userns. That's fine - at least the rules would be consistent. > /me also gets so confused about what caps mean in the userns world. 
If the resource in question (like a network interface) belongs to a namespace (netns) created by the userns in which the caller has the caps in question, then privilege is granted. Otherwise, not. What you're saying above about CAP_AUDIT_WRITE is exactly right (for how audit works right now). > (/me has larger issues with the ns concept as a whole, but that boat > sailed years and years ago) -serge ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC] audit: provide namespace information in user originated records 2013-03-20 18:49 ` Serge Hallyn @ 2013-03-20 19:01 ` Eric Paris 2013-03-20 19:17 ` Aristeu Rozanski ` (2 more replies) 0 siblings, 3 replies; 20+ messages in thread From: Eric Paris @ 2013-03-20 19:01 UTC (permalink / raw) To: Serge Hallyn Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric W. Biederman On Wed, 2013-03-20 at 13:49 -0500, Serge Hallyn wrote: > Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): > > On Wed, 2013-03-20 at 13:36 -0500, Serge Hallyn wrote: > > > Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): > > > > This is a bit fuzzy to me, perhaps due I'm not fully understanding > > > > userns implementation yet, so bear with me: > > > > I thought of changing so userns would not grant CAP_AUDIT_WRITE and > > > > CAP_AUDIT_CONTROL unless the process already has it (i.e. it'd require > > > > > > Seems like CAP_AUDIT_WRITE should be targeted against the > > > skb->netns->userns. Then CAP_AUDIT_WRITE can be treated like any other > > > capability. Last I knew (long time ago) you had to be in init_user_ns > > > to talk audit, but that's ok - this would just do the right thing in > > > any case. > > > > kauditd should be considered as existing in the init user namespace. So > > I'd think we'd want to check if the process had CAP_AUDIT_WRITE in the > > init user namespace and if so, allow it to send messages. Who care what > > *ns the process exists in. If it has it in the init namespace, go > > ahead. Thus the process that created the container would need > > CAP_AUDIT_WRITE in the init namespace for this to all work, right? > > Yes. What I was suggesting is intended to work if that situation ever > changes. But I have zero complaints about doing it as you say, as I > doubt it ever will/ought to change. > > That basically means CAP_AUDIT_WRITE would be worthless in a non-init > userns. 
That's fine - at least the rules would be consistent. [veering away from this particular patch] We are also talking about adding a CAP_AUDIT_READ and sending messages via multicast on the audit socket. The problem is I don't know how the audit socket could work in the network namespace world. Right now kauditd has: audit_sock = netlink_kernel_create(&init_net, NETLINK_AUDIT, &cfg); So there won't ever be anything on the kernel side of the audit socket in a non-init network namespace. Let's say that is fixed somehow (I assume it's possible? something? magic pixies?) I think we'd somehow need to do the CAP_AUDIT_READ check against the user namespace associated with the network namespace in question? But what messages should go to this userspace auditd? Going to have to have audit namespaces too. But only CAP_AUDIT_READ would make sense in the new audit namespace... /me wishes containers were a 'thing' instead of a bucket of semi-related nuts and bolts. -Eric ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC] audit: provide namespace information in user originated records 2013-03-20 19:01 ` Eric Paris @ 2013-03-20 19:17 ` Aristeu Rozanski 2013-03-20 19:19 ` Serge Hallyn 2013-03-20 23:23 ` Eric W. Biederman 2 siblings, 0 replies; 20+ messages in thread From: Aristeu Rozanski @ 2013-03-20 19:17 UTC (permalink / raw) To: Eric Paris Cc: Linux Containers, Serge Hallyn, Eric W. Biederman, linux-audit-H+wXaHxf7aLQT0dZR+AlfA On Wed, Mar 20, 2013 at 03:01:32PM -0400, Eric Paris wrote: > [veering away from this particular patch] > > We are also talking about adding a CAP_AUDIT_READ and sending messages > via multicast on the audit socket. The problem is I don't know how the > audit socket could work in the network namespace world. Right now > kauditd has: > > audit_sock = netlink_kernel_create(&init_net, NETLINK_AUDIT, &cfg); > > So there won't ever be anything on the kernel side of the audit socket > in a non-init network namespace. Lets say that is fixed somehow (I > assume it's possible? something? magic pixies?) I think we'd somehow > need to do the CAP_AUDIT_READ check against the user namespace > associated with the network namespace in question? But what messages > should go to this userspace auditd? > > Going to have to have audit namespaces to. But only CAP_AUDIT_READ > would make sense in the new audit namespace... I guess that could be achieved by forcing creating a new network namespace at the same time you create a new audit namespace. any new network namespace created inside this new container would lose CAP_AUDIT_*. -- Aristeu ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC] audit: provide namespace information in user originated records 2013-03-20 19:01 ` Eric Paris 2013-03-20 19:17 ` Aristeu Rozanski @ 2013-03-20 19:19 ` Serge Hallyn 2013-03-20 23:23 ` Eric W. Biederman 2 siblings, 0 replies; 20+ messages in thread From: Serge Hallyn @ 2013-03-20 19:19 UTC (permalink / raw) To: Eric Paris Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric W. Biederman Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): > On Wed, 2013-03-20 at 13:49 -0500, Serge Hallyn wrote: > > Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): > > > On Wed, 2013-03-20 at 13:36 -0500, Serge Hallyn wrote: > > > > Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): > > > > > This is a bit fuzzy to me, perhaps due I'm not fully understanding > > > > > userns implementation yet, so bear with me: > > > > > I thought of changing so userns would not grant CAP_AUDIT_WRITE and > > > > > CAP_AUDIT_CONTROL unless the process already has it (i.e. it'd require > > > > > > > > Seems like CAP_AUDIT_WRITE should be targeted against the > > > > skb->netns->userns. Then CAP_AUDIT_WRITE can be treated like any other > > > > capability. Last I knew (long time ago) you had to be in init_user_ns > > > > to talk audit, but that's ok - this would just do the right thing in > > > > any case. > > > > > > kauditd should be considered as existing in the init user namespace. So > > > I'd think we'd want to check if the process had CAP_AUDIT_WRITE in the > > > init user namespace and if so, allow it to send messages. Who care what > > > *ns the process exists in. If it has it in the init namespace, go > > > ahead. Thus the process that created the container would need > > > CAP_AUDIT_WRITE in the init namespace for this to all work, right? > > > > Yes. What I was suggesting is intended to work if that situation ever > > changes. 
But I have zero complaints about doing it as you say, as I > > doubt it ever will/ought to change. > > > > That basically means CAP_AUDIT_WRITE would be worthless in a non-init > > userns. That's fine - at least the rules would be consistent. > > [veering away from this particular patch] > > We are also talking about adding a CAP_AUDIT_READ and sending messages > via multicast on the audit socket. The problem is I don't know how the > audit socket could work in the network namespace world. Right now > kauditd has: > > audit_sock = netlink_kernel_create(&init_net, NETLINK_AUDIT, &cfg); > > So there won't ever be anything on the kernel side of the audit socket > in a non-init network namespace. Right. > Lets say that is fixed somehow (I > assume it's possible? something? magic pixies?) I think we'd somehow > need to do the CAP_AUDIT_READ check against the user namespace > associated with the network namespace in question? But what messages > should go to this userspace auditd? Ones which pertain to resources in that userns. If we ever were to sprinkle that pixie dust, then we'd know how to do this as well :) > Going to have to have audit namespaces to. But only CAP_AUDIT_READ > would make sense in the new audit namespace... It's not clear to me that an audit namespace is needed. The userns 'owns' other namespaces, so it seems like it should suffice for directing audit msgs. > /me wishes containers were a 'thing' instead of a bucket of semi-related > nuts and bolts. That sure would simplify things. However there definitely are heavy users of individual namespaces - i.e. using thousands of network namespaces but no other namespaces. -serge ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC] audit: provide namespace information in user originated records 2013-03-20 19:01 ` Eric Paris 2013-03-20 19:17 ` Aristeu Rozanski 2013-03-20 19:19 ` Serge Hallyn @ 2013-03-20 23:23 ` Eric W. Biederman [not found] ` <87y5dh8xl7.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> 2 siblings, 1 reply; 20+ messages in thread From: Eric W. Biederman @ 2013-03-20 23:23 UTC (permalink / raw) To: Eric Paris Cc: Linux Containers, Serge Hallyn, linux-audit-H+wXaHxf7aLQT0dZR+AlfA Eric Paris <eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes: > On Wed, 2013-03-20 at 13:49 -0500, Serge Hallyn wrote: >> Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): >> > On Wed, 2013-03-20 at 13:36 -0500, Serge Hallyn wrote: >> > > Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): >> > > > This is a bit fuzzy to me, perhaps due I'm not fully understanding >> > > > userns implementation yet, so bear with me: >> > > > I thought of changing so userns would not grant CAP_AUDIT_WRITE and >> > > > CAP_AUDIT_CONTROL unless the process already has it (i.e. it'd require >> > > >> > > Seems like CAP_AUDIT_WRITE should be targeted against the >> > > skb->netns->userns. Then CAP_AUDIT_WRITE can be treated like any other >> > > capability. Last I knew (long time ago) you had to be in init_user_ns >> > > to talk audit, but that's ok - this would just do the right thing in >> > > any case. >> > >> > kauditd should be considered as existing in the init user namespace. So >> > I'd think we'd want to check if the process had CAP_AUDIT_WRITE in the >> > init user namespace and if so, allow it to send messages. Who care what >> > *ns the process exists in. If it has it in the init namespace, go >> > ahead. Thus the process that created the container would need >> > CAP_AUDIT_WRITE in the init namespace for this to all work, right? >> >> Yes. What I was suggesting is intended to work if that situation ever >> changes. 
But I have zero complaints about doing it as you say, as I >> doubt it ever will/ought to change. >> >> That basically means CAP_AUDIT_WRITE would be worthless in a non-init >> userns. That's fine - at least the rules would be consistent. > > [veering away from this particular patch] > > We are also talking about adding a CAP_AUDIT_READ and sending messages > via multicast on the audit socket. The problem is I don't know how the > audit socket could work in the network namespace world. Hmm. I don't quite know how CAP_AUDIT_READ could work. When delivering a message to a socket you really don't know who is on the other end. > Right now kauditd has: > > audit_sock = netlink_kernel_create(&init_net, NETLINK_AUDIT, &cfg); > > So there won't ever be anything on the kernel side of the audit socket > in a non-init network namespace. Lets say that is fixed somehow (I > assume it's possible? something? magic pixies?) One socket for each network namespace... It is a pain but doable. > I think we'd somehow > need to do the CAP_AUDIT_READ check against the user namespace > associated with the network namespace in question? But what messages > should go to this userspace auditd? Messages generated by processes in that user namespace? > Going to have to have audit namespaces to. But only CAP_AUDIT_READ > would make sense in the new audit namespace... Given the connection of audit and security I think if we add support for a non-global auditd the user namespace seems to fit. The user namespace is certainly where all of the security connected bits go. Architecturally it gets a little tricky as it seems to make sense to generate audit messages that make sense to the process receiving them, which would mean actually generating a different audit message for different receiving contexts. I find the auditsc code odd. We log file descriptor numbers when a file is mmaped? What good is something so process-relative to anyone? On a slightly different tangent. 
Do we want to update the AUDIT_CAPSET message to report the user namespace whose caps we are changing, or perhaps to suppress the message outside of the initial user namespace? Eric ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC] audit: provide namespace information in user originated records [not found] ` <87y5dh8xl7.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org> @ 2013-03-21 1:46 ` Eric Paris 2013-03-21 2:21 ` Serge Hallyn 0 siblings, 1 reply; 20+ messages in thread From: Eric Paris @ 2013-03-21 1:46 UTC (permalink / raw) To: Eric W. Biederman Cc: Linux Containers, Serge Hallyn, linux-audit-H+wXaHxf7aLQT0dZR+AlfA On Wed, 2013-03-20 at 16:23 -0700, Eric W. Biederman wrote: > Eric Paris <eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes: > > > On Wed, 2013-03-20 at 13:49 -0500, Serge Hallyn wrote: > >> Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): > >> > On Wed, 2013-03-20 at 13:36 -0500, Serge Hallyn wrote: > >> > > Quoting Aristeu Rozanski (arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): > >> > > > This is a bit fuzzy to me, perhaps due I'm not fully understanding > >> > > > userns implementation yet, so bear with me: > >> > > > I thought of changing so userns would not grant CAP_AUDIT_WRITE and > >> > > > CAP_AUDIT_CONTROL unless the process already has it (i.e. it'd require > >> > > > >> > > Seems like CAP_AUDIT_WRITE should be targeted against the > >> > > skb->netns->userns. Then CAP_AUDIT_WRITE can be treated like any other > >> > > capability. Last I knew (long time ago) you had to be in init_user_ns > >> > > to talk audit, but that's ok - this would just do the right thing in > >> > > any case. > >> > > >> > kauditd should be considered as existing in the init user namespace. So > >> > I'd think we'd want to check if the process had CAP_AUDIT_WRITE in the > >> > init user namespace and if so, allow it to send messages. Who care what > >> > *ns the process exists in. If it has it in the init namespace, go > >> > ahead. Thus the process that created the container would need > >> > CAP_AUDIT_WRITE in the init namespace for this to all work, right? > >> > >> Yes. 
What I was suggesting is intended to work if that situation ever > >> changes. But I have zero complaints about doing it as you say, as I > >> doubt it ever will/ought to change. > >> > >> That basically means CAP_AUDIT_WRITE would be worthless in a non-init > >> userns. That's fine - at least the rules would be consistent. > > > > [veering away from this particular patch] > > > > We are also talking about adding a CAP_AUDIT_READ and sending messages > > via multicast on the audit socket. The problem is I don't know how the > > audit socket could work in the network namespace world. > > Hmm. I don't quite know how CAP_AUDIT_READ could work. When delivering > a message to a socket you really don't know who is on the other end. > > > Right now kauditd has: > > > > audit_sock = netlink_kernel_create(&init_net, NETLINK_AUDIT, &cfg); > > > > So there won't ever be anything on the kernel side of the audit socket > > in a non-init network namespace. Lets say that is fixed somehow (I > > assume it's possible? something? magic pixies?) > > One socket for each network namespace... It is a pain but doable. > > > I think we'd somehow > > need to do the CAP_AUDIT_READ check against the user namespace > > associated with the network namespace in question? But what messages > > should go to this userspace auditd? > > Messages generated by processes in that user namespace? So the kernel socket(s) would be per network namespace, but we divide messages per user namespace? Which socket do I send them on, considering the possible crazy many<->many mappings between user and network namespaces. It all makes me cry a little. > > Going to have to have audit namespaces to. But only CAP_AUDIT_READ > > would make sense in the new audit namespace... > > Given the connection of audit and security I think if we add support for > a non-global auditd the user namespace seems to fit. The user namespace > is certainly where all of the security connected bits go. 
> > Architecturally it gets a little tricky as it seems to make sense to > generate audit messages that make sense to the process receiving them, > which would mean actually generating a different audit message for > different receiving contexts. Assuming as today, we only have 1 auditd and it is system wide. We just attach consistent identifiable information (aka proc inode number, which people already use) to the audit records (this patch only does user messages, but attaching to all messages needs to be done). Moving to multiple auditd's starts to get really hard, and we might not ever pursue it :) > I find the auditsc code odd. We log file descriptor numbers when a file > is mmaped? What is something so process relative good to anyone? When an earlier record showed that fd being opened? I dunno.... > On a slightly different tangent. Do we want to update the AUDIT_CAPSET > message to report the user namespace whose caps we are changing or > perhaps to surpress the message outside of the initial user namespace. The extension of Aris's patch to syscall audit instead of just userspace audit would take care of this. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC] audit: provide namespace information in user originated records 2013-03-21 1:46 ` Eric Paris @ 2013-03-21 2:21 ` Serge Hallyn 2013-03-21 4:48 ` Eric W. Biederman 0 siblings, 1 reply; 20+ messages in thread From: Serge Hallyn @ 2013-03-21 2:21 UTC (permalink / raw) To: Eric Paris Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric W. Biederman Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org): > So the kernel socket(s) would be per network namespace, but we divide > messages per user namespace? Which socket do I send them on, > considering the possible crazy many<->many mappings between user and > network namespaces. It all makes me cry a little. not many-many - each netns is owned by exactly one userns. The userns from which the netns was created. -serge ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH RFC] audit: provide namespace information in user originated records 2013-03-21 2:21 ` Serge Hallyn @ 2013-03-21 4:48 ` Eric W. Biederman 0 siblings, 0 replies; 20+ messages in thread From: Eric W. Biederman @ 2013-03-21 4:48 UTC (permalink / raw) To: Serge Hallyn Cc: Linux Containers, linux-audit-H+wXaHxf7aLQT0dZR+AlfA, Eric Paris

Serge Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org> writes:

> Quoting Eric Paris (eparis-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org):
>> So the kernel socket(s) would be per network namespace, but we divide
>> messages per user namespace? Which socket do I send them on,
>> considering the possible crazy many<->many mappings between user and
>> network namespaces. It all makes me cry a little.
>
> not many-many - each netns is owned by exactly one userns. The userns
> from which the netns was created.

Doh. I missed this question and I think I misunderstood when Eric Paris was talking about multicasting audit messages.

If what we are really talking about is sending some audit messages to an auditd in a container what appears obvious to me is that we define a per user namespace capability something like CAP_AUDIT_CONTROL. That does most or all of what CAP_AUDIT_CONTROL does in the init user namespace. Especially capturing audit_pid and audit_nlk_portid to decide who to send the message to.

Something like:

	struct audit_control {
		int initialized;
		pid_t pid;
		u32 nlk_portid;
	};

	struct user_namespace {
		...
		struct audit_control audit;
	};

Then the transmission would be something like:

	struct user_namespace *ns = ...;
	for (;;) {
		if (ns->audit.pid) {
			err = netlink_unicast(ns->audit.sock, skb, ns->audit.nlk_portid, 0);
		}
		if (!ns->parent)
			break;
		ns = ns->parent;
	}

If someone finds auditd interesting enough to do that work. In general I think it only makes sense if we can reuse the existing userspace auditd.

Eric

^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH RFC] audit: provide namespace information in user originated records @ 2013-03-18 15:45 Aristeu Rozanski 2013-03-18 15:45 ` [PATCH RFC 7/8] audit: report namespace information along with USER events Aristeu Rozanski 0 siblings, 1 reply; 20+ messages in thread From: Aristeu Rozanski @ 2013-03-18 15:45 UTC (permalink / raw) To: linux-audit (re-sending this, linux-audit is members only it seems) This patchset introduces a new audit record to follow all USER records which provides namespace information of the process. The idea is to allow processes in containers to create records in the host system while providing means to be filtered out. For each new namespace, a unique procfs inode number is allocated and this number has been used by userspace to determine which processes belong to the same namespace. These numbers are used in the new audit record. Applications such as libvirt-sandbox and lxc can then report the same numbers when a container is created and destroyed allowing to map records to a certain container. Maybe the next step would be having a record for whenever a new namespace is created? First 6 patches are needed in order to get each namespace's inode number. Patch 7 properly defines the new record that is related to the USER record Patch 8 allows USER records to be generated from namespaces Here's an example of output: type=CRED_DISP msg=audit(1363528861.403:311): pid=20016 uid=0 auid=0 ses=45 subj=system_u:system_r:crond_t:s0-s0:c0.c1023 msg='op=PAM:setcred acct="root" exe="/usr/sbin/crond" hostname=? addr=? 
terminal=cron res=success'
type=UNKNOWN[1327] msg=audit(1363528861.403:311): mnt=4026531840 net=4026531956 uts=4026531838 ipc=4026531839 pid=4026531836 user=4026531837

Notes:
- this is an RFC, all sorts of feedback are much appreciated
- while the last patch allows a new userns to send audit records, I haven't yet looked at making sure it has proper capabilities so regular users' containers can create records
- the record number allocated is just a draft. If this patchset evolves into something that can be merged, please advise which number is the best choice

 fs/namespace.c                 | 14 +++++++
 include/linux/ipc_namespace.h  |  1
 include/linux/mnt_namespace.h  |  2 +
 include/linux/pid_namespace.h  |  1
 include/linux/user_namespace.h |  1
 include/linux/utsname.h        |  1
 include/net/net_namespace.h    |  1
 include/uapi/linux/audit.h     |  1
 ipc/namespace.c                | 14 +++++++
 kernel/audit.c                 | 76 +++++++++++++++++++++++++++++++++++++----
 kernel/pid_namespace.c         | 11 +++++
 kernel/user_namespace.c        |  5 ++
 kernel/utsname.c               | 14 +++++++
 net/core/net_namespace.c       | 14 +++++++
 14 files changed, 150 insertions(+), 6 deletions(-)

^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH RFC 7/8] audit: report namespace information along with USER events 2013-03-18 15:45 Aristeu Rozanski @ 2013-03-18 15:45 ` Aristeu Rozanski 0 siblings, 0 replies; 20+ messages in thread From: Aristeu Rozanski @ 2013-03-18 15:45 UTC (permalink / raw) To: linux-audit

For userspace generated events, include a record with the namespace
procfs inode numbers the process belongs to. This allows to track down
and filter audit messages by userspace.

Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
---
 include/uapi/linux/audit.h |    1 +
 kernel/audit.c             |   51 +++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 51 insertions(+), 1 deletions(-)

diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index 9f096f1..3ec3ccb 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -106,6 +106,7 @@
 #define AUDIT_NETFILTER_PKT	1324	/* Packets traversing netfilter chains */
 #define AUDIT_NETFILTER_CFG	1325	/* Netfilter chain modifications */
 #define AUDIT_SECCOMP		1326	/* Secure Computing event */
+#define AUDIT_USER_NAMESPACE	1327	/* Information about process' namespaces */
 
 #define AUDIT_AVC		1400	/* SE Linux avc denial or grant */
 #define AUDIT_SELINUX_ERR	1401	/* Internal SE Linux Errors */
diff --git a/kernel/audit.c b/kernel/audit.c
index 58db117..b17f9c0 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -62,6 +62,11 @@
 #include <linux/freezer.h>
 #include <linux/tty.h>
 #include <linux/pid_namespace.h>
+#include <linux/ipc_namespace.h>
+#include <linux/mnt_namespace.h>
+#include <linux/utsname.h>
+#include <linux/user_namespace.h>
+#include <net/net_namespace.h>
 
 #include "audit.h"
 
@@ -641,6 +646,49 @@ static int audit_log_common_recv_msg(struct audit_buffer **ab, u16 msg_type,
 	return rc;
 }
 
+#ifdef CONFIG_NAMESPACES
+static int audit_log_namespaces(struct task_struct *tsk,
+				struct sk_buff *skb)
+{
+	struct audit_context *ctx = tsk->audit_context;
+	struct audit_buffer *ab;
+
+	if (!audit_enabled)
+		return 0;
+
+	ab = audit_log_start(ctx, GFP_KERNEL, AUDIT_USER_NAMESPACE);
+	if (unlikely(!ab))
+		return -ENOMEM;
+
+	audit_log_format(ab, "mnt=%u", mntns_get_inum(tsk));
+#ifdef CONFIG_NET_NS
+	audit_log_format(ab, " net=%u", netns_get_inum(tsk));
+#endif
+#ifdef CONFIG_UTS_NS
+	audit_log_format(ab, " uts=%u", utsns_get_inum(tsk));
+#endif
+#ifdef CONFIG_IPC_NS
+	audit_log_format(ab, " ipc=%u", ipcns_get_inum(tsk));
+#endif
+#ifdef CONFIG_PID_NS
+	audit_log_format(ab, " pid=%u", pidns_get_inum(tsk));
+#endif
+#ifdef CONFIG_USER_NS
+	audit_log_format(ab, " user=%u", userns_get_inum(tsk));
+#endif
+	audit_set_pid(ab, NETLINK_CB(skb).portid);
+	audit_log_end(ab);
+
+	return 0;
+}
+#else
+static inline int audit_log_namespaces(struct task_struct *tsk,
+				       struct sk_buff *skb)
+{
+	return 0;
+}
+#endif
+
 static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 {
 	u32			seq, sid;
@@ -741,7 +789,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 			}
 			audit_log_common_recv_msg(&ab, msg_type,
 						  loginuid, sessionid, sid,
-						  NULL);
+						  current->audit_context);
 
 			if (msg_type != AUDIT_USER_TTY)
 				audit_log_format(ab, " msg='%.1024s'",
@@ -758,6 +806,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 			}
 			audit_set_pid(ab, NETLINK_CB(skb).portid);
 			audit_log_end(ab);
+			audit_log_namespaces(current, skb);
 		}
 		break;
 	case AUDIT_ADD:
-- 
1.7.1

^ permalink raw reply related [flat|nested] 20+ messages in thread
Thread overview: 20+ messages
[not found] <1363619405-6419-1-git-send-email-arozansk@redhat.com>
[not found] ` <1363619405-6419-9-git-send-email-arozansk@redhat.com>
2013-03-18 21:28 ` [PATCH RFC 8/8] audit: allow user records to be created inside a container Eric W. Biederman
[not found] ` <1363619405-6419-8-git-send-email-arozansk@redhat.com>
2013-03-18 21:44 ` [PATCH RFC 7/8] audit: report namespace information along with USER events Eric W. Biederman
2013-03-19 12:08 ` Aristeu Rozanski
[not found] ` <871ubc9yda.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2014-01-24 6:19 ` Richard Guy Briggs
[not found] ` <1363619405-6419-1-git-send-email-arozansk-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-18 22:16 ` [PATCH RFC] audit: provide namespace information in user originated records Eric W. Biederman
[not found] ` <877gl48iaz.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-03-19 12:24 ` Aristeu Rozanski
[not found] ` <20130319122408.GC20187-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-20 0:00 ` Eric W. Biederman
[not found] ` <874ng7gcst.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-03-20 15:12 ` Serge Hallyn
2013-03-20 15:45 ` Aristeu Rozanski
[not found] ` <20130320154503.GF20187-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-20 18:36 ` Serge Hallyn
2013-03-20 18:42 ` Eric Paris
2013-03-20 18:49 ` Serge Hallyn
2013-03-20 19:01 ` Eric Paris
2013-03-20 19:17 ` Aristeu Rozanski
2013-03-20 19:19 ` Serge Hallyn
2013-03-20 23:23 ` Eric W. Biederman
[not found] ` <87y5dh8xl7.fsf-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
2013-03-21 1:46 ` Eric Paris
2013-03-21 2:21 ` Serge Hallyn
2013-03-21 4:48 ` Eric W. Biederman
2013-03-18 15:45 Aristeu Rozanski
2013-03-18 15:45 ` [PATCH RFC 7/8] audit: report namespace information along with USER events Aristeu Rozanski