* Re: [PATCH] BSD Jail LSM (2/3)
2004-09-10 20:23 ` [PATCH] BSD Jail LSM (2/3) Serge Hallyn
@ 2004-09-10 19:31 ` Alan Cox
2004-09-12 23:33 ` Serge E. Hallyn
2004-09-12 21:12 ` [PATCH] BSD Jail LSM (2/3) Herbert Poetzl
1 sibling, 1 reply; 12+ messages in thread
From: Alan Cox @ 2004-09-10 19:31 UTC (permalink / raw)
To: Serge Hallyn; +Cc: Chris Wright, Linux Kernel Mailing List, akpm
On Gwe, 2004-09-10 at 21:23, Serge Hallyn wrote:
> Attached is a patch against the security Kconfig and Makefile to support
> bsdjail, as well as the bsdjail.c file itself. bsdjail offers
> functionality similar to (but more limited than) the vserver patch.
Looking over the code the first question I would ask is that it supports
AF_INET but not AF_INET6. That seems a bit limited in todays internet
environment.
> A process in a jail lives under a chroot which is not vulnerable to the
> well-known chdir(...)(etc)chroot(.) attack against normal chroots, and
> may be locked to one ip address. For additional features, please see
> Documentation/bsdjail.txt, which is included in the next patch.
You can break out with someone co-operating from outside the jail but
that I guess is pretty harmless anyway.
Alan
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH] BSD Jail LSM (1/3)
@ 2004-09-10 20:21 Serge Hallyn
2004-09-10 20:23 ` [PATCH] BSD Jail LSM (2/3) Serge Hallyn
2004-09-10 20:23 ` [PATCH] BSD Jail LSM (3/3) Serge Hallyn
0 siblings, 2 replies; 12+ messages in thread
From: Serge Hallyn @ 2004-09-10 20:21 UTC (permalink / raw)
To: Chris Wright, linux-kernel; +Cc: akpm, serue
[-- Attachment #1: Type: text/plain, Size: 386 bytes --]
Attached is a patch which introduces a new LSM hook,
security_task_lookup. This hook allows an LSM to mediate visibility of
/proc/<pid> on a per-process level. It applies cleanly to 2.6.8.1 and
has been tested on xSeries, pSeries, and zSeries. The bsdjail lsm which
will be sent next is a user of this hook.
Please apply.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
-serge
[-- Attachment #2: tasklookup.diff --]
[-- Type: text/x-patch, Size: 2857 bytes --]
diff -Nru linux-2.6.8.1/fs/proc/base.c linux-2.6.8.1-jail/fs/proc/base.c
--- linux-2.6.8.1/fs/proc/base.c 2004-08-14 05:55:35.000000000 -0500
+++ linux-2.6.8.1-jail/fs/proc/base.c 2004-09-01 04:42:26.000000000 -0500
@@ -1679,6 +1679,8 @@
int tgid = p->pid;
if (!pid_alive(p))
continue;
+ if (security_task_lookup(p))
+ continue;
if (--index >= 0)
continue;
tgids[nr_tgids] = tgid;
diff -Nru linux-2.6.8.1/include/linux/security.h linux-2.6.8.1-jail/include/linux/security.h
--- linux-2.6.8.1/include/linux/security.h 2004-08-14 05:55:48.000000000 -0500
+++ linux-2.6.8.1-jail/include/linux/security.h 2004-09-01 04:42:26.000000000 -0500
@@ -627,6 +627,11 @@
* Set the security attributes in @p->security for a kernel thread that
* is being reparented to the init task.
* @p contains the task_struct for the kernel thread.
+ * @task_lookup:
+ * Check permission to see the /proc/<pid> entry for process @p.
+ * @p contains the task_struct for task <pid> which is being looked
+ * up under /proc
+ * return 0 if permission is granted.
* @task_to_inode:
* Set the security attributes for an inode based on an associated task's
* security attributes, e.g. for /proc/pid inodes.
@@ -1152,6 +1157,7 @@
unsigned long arg3, unsigned long arg4,
unsigned long arg5);
void (*task_reparent_to_init) (struct task_struct * p);
+ int (*task_lookup)(struct task_struct *p);
void (*task_to_inode)(struct task_struct *p, struct inode *inode);
int (*ipc_permission) (struct kern_ipc_perm * ipcp, short flag);
@@ -1751,6 +1757,11 @@
security_ops->task_reparent_to_init (p);
}
+static inline int security_task_lookup(struct task_struct *p)
+{
+ return security_ops->task_lookup(p);
+}
+
static inline void security_task_to_inode(struct task_struct *p, struct inode *inode)
{
security_ops->task_to_inode(p, inode);
@@ -2386,6 +2397,11 @@
cap_task_reparent_to_init (p);
}
+static inline int security_task_lookup(struct task_struct *p)
+{
+ return 0;
+}
+
static inline void security_task_to_inode(struct task_struct *p, struct inode *inode)
{ }
diff -Nru linux-2.6.8.1/security/dummy.c linux-2.6.8.1-jail/security/dummy.c
--- linux-2.6.8.1/security/dummy.c 2004-08-14 05:54:51.000000000 -0500
+++ linux-2.6.8.1-jail/security/dummy.c 2004-09-01 04:42:26.000000000 -0500
@@ -616,6 +616,11 @@
return;
}
+static int dummy_task_lookup(struct task_struct *p)
+{
+ return 0;
+}
+
static void dummy_task_to_inode(struct task_struct *p, struct inode *inode)
{ }
@@ -978,6 +983,7 @@
set_to_dummy_if_null(ops, task_kill);
set_to_dummy_if_null(ops, task_prctl);
set_to_dummy_if_null(ops, task_reparent_to_init);
+ set_to_dummy_if_null(ops, task_lookup);
set_to_dummy_if_null(ops, task_to_inode);
set_to_dummy_if_null(ops, ipc_permission);
set_to_dummy_if_null(ops, msg_msg_alloc_security);
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] BSD Jail LSM (2/3)
2004-09-10 20:21 [PATCH] BSD Jail LSM (1/3) Serge Hallyn
@ 2004-09-10 20:23 ` Serge Hallyn
2004-09-10 19:31 ` Alan Cox
2004-09-12 21:12 ` [PATCH] BSD Jail LSM (2/3) Herbert Poetzl
2004-09-10 20:23 ` [PATCH] BSD Jail LSM (3/3) Serge Hallyn
1 sibling, 2 replies; 12+ messages in thread
From: Serge Hallyn @ 2004-09-10 20:23 UTC (permalink / raw)
To: Chris Wright; +Cc: linux-kernel, akpm
[-- Attachment #1: Type: text/plain, Size: 649 bytes --]
Attached is a patch against the security Kconfig and Makefile to support
bsdjail, as well as the bsdjail.c file itself. bsdjail offers
functionality similar to (but more limited than) the vserver patch.
A process in a jail lives under a chroot which is not vulnerable to the
well-known chdir(...)(etc)chroot(.) attack against normal chroots, and
may be locked to one ip address. For additional features, please see
Documentation/bsdjail.txt, which is included in the next patch.
The patch applies cleanly to 2.6.8.1, and has been tested on xSeries,
pSeries, and zSeries.
Please apply.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
-serge
[-- Attachment #2: jail.diff --]
[-- Type: text/x-patch, Size: 36019 bytes --]
diff -Nru /home/hallyn/kernels/linux-2.6.8.1/security/bsdjail.c linux-2.6.8.1/security/bsdjail.c
--- /home/hallyn/kernels/linux-2.6.8.1/security/bsdjail.c 1969-12-31 18:00:00.000000000 -0600
+++ linux-2.6.8.1/security/bsdjail.c 2004-09-10 14:12:57.150691064 -0500
@@ -0,0 +1,1384 @@
+/*
+ * File: linux/security/bsdjail.c
+ * Author: Serge Hallyn (serue@us.ibm.com)
+ * Date: Sep 1, 2004
+ *
+ * (See Documentation/bsdjail.txt for more information)
+ *
+ * Copyright (C) 2004 International Business Machines <serue@us.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/config.h>
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/security.h>
+#include <linux/namei.h>
+#include <linux/namespace.h>
+#include <linux/proc_fs.h>
+#include <linux/in.h>
+#include <linux/pagemap.h>
+#include <linux/ip.h>
+#include <linux/mount.h>
+#include <asm/uaccess.h>
+#include <linux/netdevice.h>
+#include <linux/inetdevice.h>
+#include <linux/seq_file.h>
+#include <linux/un.h>
+#include <linux/smp_lock.h>
+#include <linux/kref.h>
+
+static int jail_debug = 0;
+MODULE_PARM(jail_debug, "i");
+MODULE_PARM_DESC(jail_debug, "Print bsd jail debugging messages.\n");
+
+#define DBG 0
+#define WARN 1
+#define bsdj_debug(how, fmt, arg... ) \
+ do { \
+ if ( how || jail_debug ) \
+ printk(KERN_NOTICE "%s: %s: " fmt, \
+ MY_NAME, __FUNCTION__, \
+ ## arg ); \
+ } while ( 0 )
+
+#define MY_NAME "bsdjail"
+
+/* flag to keep track of how we were registered */
+static int secondary = 0;
+
+/*
+ * The task structure holding jail information.
+ * Taskp->security points to one of these (or is null).
+ * There is exactly one jail_struct for each jail. If >1 process
+ * are in the same jail, they share the same jail_struct.
+ */
+struct jail_struct {
+ struct kref kref;
+
+ /* these are set on writes to /proc/<pid>/attr/exec */
+ char *root_pathname; /* char * containing path to use as jail / */
+ char *ip_addr_name; /* char * containing ip addr to use for jail */
+
+ /* these are set when a jail becomes active */
+ __u32 realaddr; /* internal form of ip_addr_name */
+ struct dentry *dentry; /* dentry of fs root */
+ struct vfsmount *mnt; /* vfsmnt of fs root */
+
+ /* Resource limits. 0 = no limit */
+ int max_nrtask; /* maximum number of tasks within this jail. */
+ int cur_nrtask; /* current number of tasks within this jail. */
+ long maxtimeslice; /* max timeslice in ms for procs in this jail */
+ long nice; /* nice level for processes in this jail */
+ long max_data, max_memlock; /* equivalent to RLIMIT_{DATA,MEMLOCK} */
+/* values for the jail_flags field */
+#define GOT_NETWORK 1 /* if not set, jail can use any valid net address */
+#define IN_USE 2 /* if 0, task is setting up jail, not yet in it */
+ char jail_flags;
+};
+
+#define in_use(x) (x->jail_flags & IN_USE)
+#define set_in_use(x) (x->jail_flags |= IN_USE)
+
+#define got_network(x) (x->jail_flags & GOT_NETWORK)
+#define set_got_network(x) (x->jail_flags |= GOT_NETWORK)
+#define unset_got_network(x) (x->jail_flags &= ~GOT_NETWORK)
+
+/*
+ * structs, defines, and functions to cope with stacking
+ */
+
+#define get_task_security(task) (task->security)
+#define get_inode_security(inode) (inode->i_security)
+#define get_sock_security(sock) (sock->sk_security)
+#define get_file_security(file) (file->f_security)
+#define get_ipc_security(ipc) (ipc->security)
+
+#define jail_of(proc) (get_task_security(proc))
+
+/*
+ * disable_jail: A jail which was in use, but has no references
+ * left, is disabled - we free up the mountpoint and dentry, and
+ * give up our reference on the module.
+ *
+ * don't need to put namespace, it will be done automatically
+ * when the last process in jail is put.
+ * DO need to put the dentry and vfsmount
+ */
+static void
+disable_jail(struct jail_struct *tsec)
+{
+ dput(tsec->dentry);
+ mntput(tsec->mnt);
+ module_put(THIS_MODULE);
+}
+
+
+static void free_jail(struct jail_struct *tsec)
+{
+ if (!tsec)
+ return;
+
+ if (tsec->root_pathname)
+ kfree(tsec->root_pathname);
+ if (tsec->ip_addr_name)
+ kfree(tsec->ip_addr_name);
+ kfree(tsec);
+}
+
+#define set_task_security(task,data) task->security = data
+#define set_inode_security(inode,data) inode->i_security = data
+#define set_sock_security(sock,data) sock->sk_security = data
+#define set_file_security(file,data) file->f_security = data
+#define set_ipc_security(ipc,data) ipc.security = data
+
+/*
+ * jail_task_free_security: this is the callback hooked into LSM.
+ * If there was no task->security field for bsdjail, do nothing.
+ * If there was, but it was never put into use, free the jail.
+ * If there was, and the jail is in use, then decrement the usage
+ * count, and disable and free the jail if the usage count hits 0.
+ */
+static void jail_task_free_security(struct task_struct *task)
+{
+ struct jail_struct *tsec;
+
+ tsec = get_task_security(task);
+
+ if (!tsec)
+ return;
+
+ if (!in_use(tsec)) {
+ /*
+ * someone did 'echo -n x > /proc/<pid>/attr/exec' but
+ * then forked before execing. Nuke the old info.
+ */
+ free_jail(tsec);
+ set_task_security(task,NULL);
+ return;
+ }
+ tsec->cur_nrtask--;
+ /* If this was the last process in the jail, delete the jail */
+ kref_put(&tsec->kref);
+}
+
+static struct jail_struct *
+alloc_task_security(struct task_struct *tsk)
+{
+ struct jail_struct *tsec;
+ tsec = kmalloc(sizeof(struct jail_struct), GFP_KERNEL);
+ if (!tsec)
+ return ERR_PTR(-ENOMEM);
+ memset(tsec, 0, sizeof(struct jail_struct));
+ set_task_security(tsk, tsec);
+ return tsec;
+}
+
+static inline int
+in_jail(struct task_struct *t)
+{
+ struct jail_struct *tsec = jail_of(t);
+
+ if (tsec && in_use(tsec))
+ return 1;
+
+ return 0;
+}
+
+/*
+ * If a network address was passed into /proc/<pid>/attr/exec,
+ * then process in its jail will only be allowed to bind/listen
+ * to that address.
+ */
+void
+setup_netaddress(struct jail_struct *tsec)
+{
+ unsigned int a,b,c,d;
+
+ unset_got_network(tsec);
+ tsec->realaddr = 0;
+ if (!tsec->ip_addr_name)
+ return;
+
+ if (sscanf(tsec->ip_addr_name,"%u.%u.%u.%u",&a,&b,&c,&d)!=4)
+ return;
+ if (a>255 || b>255 || c>255 || d>255)
+ return;
+ tsec->realaddr = htonl((a<<24)|(b<<16)|(c<<8)|d);
+ set_got_network(tsec);
+ bsdj_debug(DBG, "Network set up (%s)\n", tsec->ip_addr_name);
+}
+
+/* release_jail:
+ * Callback for kref_put to use for releasing a jail when its
+ * last user exits.
+ */
+static void release_jail(struct kref *kref)
+{
+ struct jail_struct *tsec;
+
+ tsec = container_of(kref,struct jail_struct,kref);
+ disable_jail(tsec);
+ free_jail(tsec);
+}
+
+/*
+ * enable_jail:
+ * Called when a process is placed into a new jail to handle the
+ * actual creation of the jail.
+ * Creates namespace
+ * Sets process root+pwd
+ * Stores the requested ip address
+ * Registers a unique pseudo-proc filesystem for this jail
+ */
+int enable_jail(struct task_struct *tsk)
+{
+ struct nameidata nd;
+ struct jail_struct *tsec;
+ int retval = -EFAULT;
+
+ tsec = jail_of(tsk);
+ if (!tsec || !tsec->root_pathname)
+ goto out;
+
+ /*
+ * USE_JAIL_NAMESPACE: could be useful, so that future mounts outside
+ * the jail don't affect the jail. But it's not necessary, and
+ * requires exporting copy_namespace from fs/namespace.c
+ *
+ * Actually, it would also be useful for truly hiding
+ * information about mounts which do not exist in this jail.
+#define USE_JAIL_NAMESPACE
+ */
+#ifdef USE_JAIL_NAMESPACE
+ bsdj_debug(DBG, "bsdjail: copying namespace.\n");
+ retval = -EPERM;
+ if (copy_namespace(CLONE_NEWNS, tsk))
+ goto out;
+ bsdj_debug(DBG, "bsdjail: copied namespace.\n");
+#endif
+
+ /* find our new root directory */
+ bsdj_debug(DBG, "bsdjail: looking up %s\n", tsec->root_pathname);
+ retval = path_lookup(tsec->root_pathname, LOOKUP_FOLLOW | LOOKUP_DIRECTORY, &nd);
+ if (retval)
+ goto out;
+
+ bsdj_debug(DBG, "bsdjail: got %s, setting root to it\n", tsec->root_pathname);
+
+ /* and set the fsroot to it */
+ set_fs_root(tsk->fs, nd.mnt, nd.dentry);
+ set_fs_pwd(tsk->fs, nd.mnt, nd.dentry);
+
+ bsdj_debug(DBG, "bsdjail: root has been set. Have fun.\n");
+
+ /* set up networking */
+ if (tsec->ip_addr_name)
+ setup_netaddress(tsec);
+
+ tsec->cur_nrtask = 1;
+ if (tsec->nice)
+ set_user_nice(current, tsec->nice);
+ if (tsec->max_data) {
+ current->rlim[RLIMIT_DATA].rlim_cur = tsec->max_data;
+ current->rlim[RLIMIT_DATA].rlim_max = tsec->max_data;
+ }
+ if (tsec->max_memlock) {
+ current->rlim[RLIMIT_MEMLOCK].rlim_cur = tsec->max_memlock;
+ current->rlim[RLIMIT_MEMLOCK].rlim_max = tsec->max_memlock;
+ }
+ if (tsec->maxtimeslice) {
+ current->rlim[RLIMIT_CPU].rlim_cur = tsec->maxtimeslice;
+ current->rlim[RLIMIT_CPU].rlim_max = tsec->maxtimeslice;
+ }
+ /* success and end */
+ tsec->mnt = mntget(nd.mnt);
+ tsec->dentry = dget(nd.dentry);
+ path_release(&nd);
+ kref_init(&tsec->kref, release_jail);
+ set_in_use(tsec);
+
+ /* won't let ourselves be removed until this jail goes away */
+ try_module_get(THIS_MODULE);
+
+ return 0;
+
+out:
+ return retval;
+}
+
+/*
+ * LSM /proc/<pid>/attr hooks.
+ * You may write into /proc/<pid>/attr/exec:
+ * root /some/path
+ * ip 2.2.2.2
+ * These values will be used on the next exec() to set up your jail
+ * (assuming you're not already in a jail)
+ */
+static int
+jail_setprocattr(struct task_struct *p, char *name, void *value, size_t size)
+{
+ struct jail_struct *tsec = jail_of(current);
+ long val;
+ int start, len;
+
+ if (tsec && in_use(tsec))
+ return -EINVAL; /* let them guess why */
+
+ if (p != current || strcmp(name, "exec"))
+ return -EPERM;
+
+ if (strncmp(value, "root ", 5)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ if (tsec->root_pathname)
+ kfree(tsec->root_pathname);
+ start = 5;
+ len = size-start;
+ tsec->root_pathname = kmalloc(len+1, GFP_KERNEL);
+ if (!tsec->root_pathname)
+ return -ENOMEM;
+ strncpy(tsec->root_pathname, value+start, len);
+ tsec->root_pathname[len] = '\0';
+ } else if (strncmp(value, "ip ", 3)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ if (tsec->ip_addr_name)
+ kfree(tsec->ip_addr_name);
+ start = 3;
+ len = size-start;
+ tsec->ip_addr_name = kmalloc(len+1, GFP_KERNEL);
+ if (!tsec->ip_addr_name)
+ return -ENOMEM;
+ strncpy(tsec->ip_addr_name, value+start, len);
+ tsec->ip_addr_name[len] = '\0';
+
+ /* the next two are equivalent */
+ } else if (strncmp(value, "slice ", 6)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ val = simple_strtoul(value+6, NULL, 0);
+ tsec->maxtimeslice = val;
+ } else if (strncmp(value, "timeslice ", 10)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ val = simple_strtoul(value+10, NULL, 0);
+ tsec->maxtimeslice = val;
+ } else if (strncmp(value, "nrtask ", 7)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ val = (int) simple_strtol(value+7, NULL, 0);
+ if (val < 1)
+ return -EINVAL;
+ tsec->max_nrtask = val;
+ } else if (strncmp(value, "memlock ", 8)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ val = simple_strtoul(value+8, NULL, 0);
+ tsec->max_memlock = val;
+ } else if (strncmp(value, "data ", 5)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ val = simple_strtoul(value+5, NULL, 0);
+ tsec->max_data = val;
+ } else if (strncmp(value, "nice ", 5)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ val = simple_strtoul(value+5, NULL, 0);
+ tsec->nice = val;
+ } else
+ return -EINVAL;
+
+ return size;
+}
+
+static int print_jail_net_info(struct jail_struct *j, char *buf, int maxcnt)
+{
+ if (j->ip_addr_name)
+ return snprintf(buf, maxcnt, "%s\n", j->ip_addr_name);
+
+ return snprintf(buf, maxcnt, "No network information\n");
+}
+
+/*
+ * LSM /proc/<pid>/attr read hook.
+ *
+ * /proc/$$/attr/current output:
+ * If the reading process, say process 1001, is in a jail, then
+ * cat /proc/999/attr/current
+ * will print networking information.
+ * If the reading process, say process 1001, is not in a jail, then
+ * cat /proc/999/attr/current
+ * will return
+ * root: (root of jail)
+ * ip: (ip address of jail)
+ * if 999 is in a jail, or
+ * -EINVAL
+ * if 999 is not in a jail.
+ *
+ * /proc/$$/attr/exec output:
+ * A process in a jail gets -EINVAL for /proc/$$/attr/exec.
+ * A process not in a jail gets hints on starting a jail.
+ */
+static int
+jail_getprocattr(struct task_struct *p, char *name, void *value, size_t size)
+{
+ struct jail_struct *tsec;
+ int err = 0;
+
+ if (in_jail(current)) {
+ if (strcmp(name, "current")==0) {
+ /* provide network info */
+ err = print_jail_net_info(jail_of(current), value,
+ size);
+ return err;
+ }
+ return -EINVAL; /* let them guess why */
+ }
+
+ if (strcmp(name, "exec") == 0) {
+ /* Print usage some help */
+ err = snprintf(value, size,
+ "Valid keywords:\n"
+ "root <pathname>\n"
+ "ip <ip4-addr>\n"
+ "nrtask <max number of tasks in this jail>\n"
+ "nice <nice level for processes in this jail>\n"
+ "slice <max timeslice per process in msecs>\n"
+ "data <max data size per process in bytes>\n"
+ "memlock <max lockable memory per process in bytes>\n");
+ return err;
+ }
+
+ if (strcmp(name, "current"))
+ return -EPERM;
+
+ tsec = jail_of(p);
+ if (!tsec || !in_use(tsec)) {
+ err = snprintf(value, size, "Not Jailed\n");
+ } else {
+ err = snprintf(value, size,
+ "Root: %s\nIP: %s\n"
+ "max_nrtask %d current nrtask %d max_timeslice %lu "
+ "nice %lu\n"
+ "max_memlock %lu max_data %lu\n",
+ tsec->root_pathname,
+ tsec->ip_addr_name ? tsec->ip_addr_name : "(none)",
+ tsec->max_nrtask, tsec->cur_nrtask, tsec->maxtimeslice,
+ tsec->nice, tsec->max_data, tsec->max_memlock);
+ }
+
+ return err;
+}
+
+/*
+ * Forbid a process in a jail from sending a signal to a process in another
+ * (or no) jail through file sigio.
+ *
+ * We consider the process which set the fowner to be the one sending the
+ * signal, rather than the one writing to the file. Therefore we store the
+ * jail of a process during jail_file_set_fowner, then check that against
+ * the jail of the process receiving the signal.
+ */
+static int
+jail_file_send_sigiotask(struct task_struct *tsk, struct fown_struct *fown,
+ int fd, int reason)
+{
+ struct file *file;
+ struct jail_struct *tsec, *fsec;
+
+ if (!in_jail(current))
+ return 0;
+
+ file = (struct file *)((long)fown - offsetof(struct file,f_owner));
+ tsec = jail_of(tsk);
+ fsec = get_file_security(file);
+
+ if (fsec != tsec)
+ return -EPERM;
+
+ return 0;
+}
+
+static int
+jail_file_set_fowner(struct file *file)
+{
+ struct jail_struct *tsec;
+
+ tsec = jail_of(current);
+ set_file_security(file, tsec);
+ if (tsec)
+ kref_get(&tsec->kref);
+
+ return 0;
+}
+
+static void free_ipc_security(struct kern_ipc_perm *ipc)
+{
+ struct jail_struct *tsec;
+
+ tsec = get_ipc_security(ipc);
+ if (!tsec)
+ return;
+ kref_put(&tsec->kref);
+ set_ipc_security((*ipc), NULL);
+}
+
+static void free_file_security(struct file *file)
+{
+ struct jail_struct *tsec;
+
+ tsec = get_file_security(file);
+ if (!tsec)
+ return;
+ kref_put(&tsec->kref);
+ set_file_security(file, NULL);
+}
+
+static void free_inode_security(struct inode *inode)
+{
+ struct jail_struct *tsec;
+
+ tsec = get_inode_security(inode);
+ if (!tsec)
+ return;
+ kref_put(&tsec->kref);
+ set_inode_security(inode, NULL);
+}
+
+/*
+ * LSM ptrace hook:
+ * process in jail may not ptrace process not in the same jail
+ */
+static int
+jail_ptrace (struct task_struct *tracer, struct task_struct *tracee)
+{
+ struct jail_struct *tsec = jail_of(tracer);
+
+ if (tsec && in_use(tsec)) {
+ if (tsec == jail_of(tracee))
+ return 0;
+ return -EPERM;
+ }
+ return 0;
+}
+
+
+#define loopbackaddr htonl((127 << 24) | 1)
+
+/*
+ * process in jail may only use one (aliased) ip address. If they try to
+ * attach to 127.0.0.1, that is remapped to their own address. If some
+ * other address (and not their own), deny permission
+ */
+static int jail_socket_unix_bind(struct socket *sock, struct sockaddr *address,
+ int addrlen);
+
+static int
+jail_socket_bind(struct socket *sock, struct sockaddr *address, int addrlen)
+{
+ struct jail_struct *tsec = jail_of(current);
+ struct sockaddr_in *inaddr;
+ __u32 sin_addr, jailaddr;
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ if (sock->sk->sk_family == AF_UNIX)
+ return jail_socket_unix_bind(sock, address, addrlen);
+
+ if (address->sa_family != AF_INET)
+ return 0;
+
+ if (!got_network(tsec))
+ /* If we want to be strict, we could just
+ * deny net access when lacking a pseudo ip.
+ * For now we just allow it. */
+ return 0;
+
+ inaddr = (struct sockaddr_in *)address;
+ sin_addr = inaddr->sin_addr.s_addr;
+ jailaddr = tsec->realaddr;
+
+ if (sin_addr == jailaddr)
+ return 0;
+
+ if (sin_addr == loopbackaddr || !sin_addr) {
+ bsdj_debug(DBG, "Got a loopback or 0 address\n");
+ sin_addr = jailaddr;
+ bsdj_debug(DBG, "Converted to: %u.%u.%u.%u\n",
+ NIPQUAD(sin_addr));
+ return 0;
+ }
+
+ return -EPERM;
+}
+
+static void
+jail_socket_post_create(struct socket *sock, int family, int type,
+ int protocol, int kern)
+{
+ struct inet_opt *inet;
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec) || kern)
+ return;
+ if (!got_network(tsec))
+ return;
+
+ if (sock->sk->sk_family != AF_INET)
+ return;
+
+ inet = inet_sk(sock->sk);
+ inet->saddr = tsec->realaddr;
+
+ return;
+}
+
+static int
+jail_socket_listen(struct socket *sock, int backlog)
+{
+ struct inet_opt *inet;
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ if (!got_network(tsec))
+ return 0;
+
+ if (sock->sk->sk_family != AF_INET)
+ return 0;
+
+ inet = inet_sk(sock->sk);
+
+ if (inet->saddr == tsec->realaddr)
+ return 0;
+
+ return -EPERM;
+}
+
+static void free_sock_security(struct sock *sk)
+{
+ struct jail_struct *tsec;
+
+ tsec = get_sock_security(sk);
+ if (!tsec)
+ return;
+ kref_put(&tsec->kref);
+ set_sock_security(sk, NULL);
+}
+
+/*
+ * The next three (socket) hooks prevent a process in a jail from sending
+ * data to a abstract unix domain socket which was bound outside the jail.
+ */
+static int
+jail_socket_unix_bind(struct socket *sock, struct sockaddr *address,
+ int addrlen)
+{
+ struct sockaddr_un *sunaddr;
+ struct jail_struct *tsec;
+
+ if (sock->sk->sk_family != AF_UNIX)
+ return 0;
+
+ sunaddr = (struct sockaddr_un *)address;
+ if (sunaddr->sun_path[0] != 0)
+ return 0;
+
+ tsec = jail_of(current);
+ set_sock_security(sock->sk, tsec);
+ if (tsec)
+ kref_get(&tsec->kref);
+ return 0;
+}
+
+/*
+ * Note - we deny sends both from unjailed to jailed, and from jailed
+ * to unjailed. As well as, of course between different jails.
+ */
+static int
+jail_socket_unix_may_send(struct socket *sock, struct socket *other)
+{
+ struct jail_struct *tsec, *ssec;
+
+ tsec = jail_of(current); /* jail of sending process */
+ ssec = get_sock_security(other->sk); /* jail of receiver */
+
+ if (tsec != ssec)
+ return -EPERM;
+
+ return 0;
+}
+
+static int
+jail_socket_unix_stream_connect(struct socket *sock,
+ struct socket *other, struct sock *newsk)
+{
+ struct jail_struct *tsec, *ssec;
+
+ tsec = jail_of(current); /* jail of sending process */
+ ssec = get_sock_security(other->sk); /* jail of receiver */
+
+ if (tsec != ssec)
+ return -EPERM;
+
+ return 0;
+}
+
+static int
+jail_mount(char * dev_name, struct nameidata *nd, char * type,
+ unsigned long flags, void * data)
+{
+ if (in_jail(current))
+ return -EPERM;
+
+ return 0;
+}
+
+static int
+jail_umount(struct vfsmount *mnt, int flags)
+{
+ if (in_jail(current))
+ return -EPERM;
+
+ return 0;
+}
+
+/*
+ * process in jail may not:
+ * use nice
+ * change network config
+ * load/unload modules
+ */
+static int
+jail_capable (struct task_struct *tsk, int cap)
+{
+ if (in_jail(tsk)) {
+ if (cap == CAP_SYS_NICE)
+ return -EPERM;
+ if (cap == CAP_NET_ADMIN)
+ return -EPERM;
+ if (cap == CAP_SYS_MODULE)
+ return -EPERM;
+ if (cap == CAP_SYS_RAWIO)
+ return -EPERM;
+ }
+
+ if (cap_is_fs_cap (cap) ? tsk->fsuid == 0 : tsk->euid == 0)
+ return 0;
+ return -EPERM;
+}
+
+/*
+ * jail_security_task_create:
+ *
+ * If the current process is ina a jail, and that jail is about to exceed a
+ * maximum number of processes, then refuse to fork. If the maximum number
+ * of jails is listed as 0, then there is no limit for this jail, and we allow
+ * all forks.
+ */
+static inline int
+jail_security_task_create (unsigned long clone_flags)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ if (tsec->max_nrtask && tsec->cur_nrtask >= tsec->max_nrtask)
+ return -EPERM;
+ return 0;
+}
+
+/*
+ * The child of a process in a jail belongs in the same jail
+ */
+static int
+jail_task_alloc_security(struct task_struct *tsk)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ set_task_security(tsk, tsec);
+ kref_get(&tsec->kref);
+ tsec->cur_nrtask++;
+ if (tsec->maxtimeslice) {
+ tsk->rlim[RLIMIT_CPU].rlim_max = tsec->maxtimeslice;
+ tsk->rlim[RLIMIT_CPU].rlim_cur = tsec->maxtimeslice;
+ }
+ if (tsec->max_data) {
+ tsk->rlim[RLIMIT_CPU].rlim_max = tsec->max_data;
+ tsk->rlim[RLIMIT_CPU].rlim_cur = tsec->max_data;
+ }
+ if (tsec->max_memlock) {
+ tsk->rlim[RLIMIT_CPU].rlim_max = tsec->max_memlock;
+ tsk->rlim[RLIMIT_CPU].rlim_cur = tsec->max_memlock;
+ }
+ if (tsec->nice)
+ set_user_nice(current, tsec->nice);
+
+ return 0;
+}
+
+static int
+jail_bprm_alloc_security(struct linux_binprm *bprm)
+{
+ struct jail_struct *tsec;
+ int ret;
+
+ tsec = jail_of(current);
+ if (!tsec)
+ return 0;
+
+ if (in_use(tsec))
+ return 0;
+
+ if (tsec->root_pathname) {
+ ret = enable_jail(current);
+ if (ret) {
+ /* if we failed, nix out the root/ip requests */
+ jail_task_free_security(current);
+ return ret;
+ }
+ }
+ return 0;
+}
+
+/*
+ * Process in jail may not create devices
+ * Thanks to Brad Spender for pointing out fifos should be allowed.
+ */
+/* TODO: We may want to allow /dev/log, at least... */
+static int
+jail_inode_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
+{
+ if (!in_jail(current))
+ return 0;
+
+ if (S_ISFIFO(mode))
+ return 0;
+
+ return -EPERM;
+}
+
+/* yanked from fs/proc/base.c */
+static unsigned name_to_int(struct dentry *dentry)
+{
+ const char *name = dentry->d_name.name;
+ int len = dentry->d_name.len;
+ unsigned n = 0;
+
+ if (len > 1 && *name == '0')
+ goto out;
+ while (len-- > 0) {
+ unsigned c = *name++ - '0';
+ if (c > 9)
+ goto out;
+ if (n >= (~0U-9)/10)
+ goto out;
+ n *= 10;
+ n += c;
+ }
+ return n;
+out:
+ return ~0U;
+}
+
+/*
+ * jail_proc_inode_permission:
+ * called only when current is in a jail, and is trying to reach
+ * /proc/<pid>. We check whether <pid> is in the same jail as
+ * current. If not, permission is denied.
+ *
+ * NOTE: On the one hand, the task_to_inode(inode)->i_security
+ * approach seems cleaner, but on the other, this prevents us
+ * from unloading bsdjail for awhile...
+ */
+static int
+jail_proc_inode_permission(struct inode *inode, int mask,
+ struct nameidata *nd)
+{
+ struct jail_struct *tsec = jail_of(current);
+ struct dentry *dentry = nd->dentry;
+ unsigned pid;
+
+ pid = name_to_int(dentry);
+ if (pid == ~0U) {
+ struct qstr *dname = &dentry->d_name;
+ if (strcmp(dname->name, "scsi")==0 ||
+ strcmp(dname->name, "sys")==0 ||
+ strcmp(dname->name, "ide")==0)
+ return -EPERM;
+ return 0;
+ }
+
+ if (dentry->d_parent != dentry->d_sb->s_root)
+ return 0;
+ if (get_inode_security(inode) != tsec)
+ return -ENOENT;
+
+ return 0;
+}
+
+/*
+ * Here is our attempt to prevent chroot escapes.
+ */
+static int
+is_jailroot_parent(struct dentry *candidate, struct dentry *root,
+ struct vfsmount *rootmnt)
+{
+ if (candidate == root)
+ return 0;
+
+ /* simple case: fs->root/.. == candidate */
+ if (root->d_parent == candidate)
+ return 1;
+
+ /*
+ * now more complicated: if fs->root is a mounted directory,
+ * then chdir(..) out of fs->root, at follow_dotdot, will follow
+ * the fs->root mount point. So we must check the parent dir of
+ * the fs->root mount point.
+ */
+ if (rootmnt->mnt_root == root && rootmnt->mnt_mountpoint!=root) {
+ root = rootmnt->mnt_mountpoint;
+ rootmnt = rootmnt->mnt_parent;
+ return is_jailroot_parent(candidate, root, rootmnt);
+ }
+
+ return 0;
+}
+
+/*
+ * A process in a jail may not see that /proc/<pid> exists for
+ * process not in its jail
+ * Unfortunately we can't pretend that pid for the starting process
+ * is 1, as vserver does.
+ */
+static int jail_task_lookup(struct task_struct *p)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec)
+ return 0;
+ if (tsec == jail_of(p))
+ return 0;
+ return -EPERM;
+}
+/*
+ * security_task_to_inode:
+ * Set inode->security = task's jail.
+ */
+static void jail_task_to_inode(struct task_struct *p, struct inode *inode)
+{
+ struct jail_struct *tsec = jail_of(p);
+
+ if (!tsec || !in_use(tsec))
+ return;
+ if (get_inode_security(inode))
+ return;
+ kref_get(&tsec->kref);
+ set_inode_security(inode, tsec);
+}
+
+/*
+ * inode_permission:
+ * If we are trying to look into certain /proc files from in a jail, we
+ * may deny permission.
+ * If we are trying to cd(..), but the cwd is the root of our jail, then
+ * permission is denied.
+ */
+static int
+jail_inode_permission(struct inode *inode, int mask,
+ struct nameidata *nd)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ if (!nd)
+ return 0;
+
+ if (nd->dentry &&
+ strcmp(nd->dentry->d_sb->s_type->name, "proc")==0) {
+ return jail_proc_inode_permission(inode, mask, nd);
+
+ }
+
+ if (!(mask&MAY_EXEC))
+ return 0;
+ if (!inode || !S_ISDIR(inode->i_mode))
+ return 0;
+
+ if (is_jailroot_parent(nd->dentry, tsec->dentry, tsec->mnt)) {
+ bsdj_debug(WARN,"Attempt to chdir(..) out of jail!\n"
+ "(%s is a subdir of %s)\n",
+ tsec->dentry->d_name.name,
+ nd->dentry->d_name.name);
+ return -EPERM;
+ }
+
+ return 0;
+}
+
+/*
+ * A function which returns -ENOENT if dentry is the dentry for
+ * a /proc/<pid> directory. It returns 0 otherwise.
+ */
+static inline int
+generic_procpid_check(struct dentry *dentry)
+{
+ struct jail_struct *jail = jail_of(current);
+ unsigned pid = name_to_int(dentry);
+
+ if (!jail || !in_use(jail))
+ return 0;
+ if (pid == ~0U)
+ return 0;
+ if (strcmp(dentry->d_sb->s_type->name, "proc")!=0)
+ return 0;
+ if (dentry->d_parent != dentry->d_sb->s_root)
+ return 0;
+ if (get_inode_security(dentry->d_inode) != jail)
+ return -ENOENT;
+ return 0;
+}
+
+/*
+ * We want getattr to fail on /proc/<pid> to prevent leakage through, for
+ * instance, ls -d.
+ */
+static int
+jail_inode_getattr(struct vfsmount *mnt, struct dentry *dentry)
+{
+ return generic_procpid_check(dentry);
+}
+
+/* This probably is not necessary - /proc does not support xattrs? */
+static int
+jail_inode_getxattr(struct dentry *dentry, char *name)
+{
+ return generic_procpid_check(dentry);
+}
+
+/* process in jail may not send signal to process not in the same jail */
+static int
+jail_task_kill(struct task_struct *p, struct siginfo *info, int sig)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ if (tsec == jail_of(p))
+ return 0;
+
+ if (sig==SIGCHLD)
+ return 0;
+
+ return -EPERM;
+}
+
+/*
+ * LSM hooks to limit jailed process' abilities to muck with resource
+ * limits
+ */
+static int jail_task_setrlimit (unsigned int resource, struct rlimit *new_rlim)
+{
+ if (!in_jail(current))
+ return 0;
+
+ return -EPERM;
+}
+
+static int jail_task_setscheduler (struct task_struct *p, int policy,
+ struct sched_param *lp)
+{
+ if (!in_jail(current))
+ return 0;
+
+ return -EPERM;
+}
+
+/*
+ * LSM hooks to limit IPC access.
+ */
+
+static inline int
+basic_ipc_security_check(struct kern_ipc_perm *p, struct task_struct *target)
+{
+ struct jail_struct *tsec = jail_of(target);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ if (get_ipc_security(p) != tsec)
+ return -EPERM;
+
+ return 0;
+}
+
+static int
+jail_ipc_permission(struct kern_ipc_perm *ipcp, short flag)
+{
+ return basic_ipc_security_check(ipcp, current);
+}
+
+static int
+jail_shm_alloc_security (struct shmid_kernel *shp)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+ set_ipc_security(shp->shm_perm, tsec);
+ kref_get(&tsec->kref);
+ return 0;
+}
+
+static void
+jail_shm_free_security (struct shmid_kernel *shp)
+{
+ free_ipc_security(&shp->shm_perm);
+}
+
+static int
+jail_shm_associate (struct shmid_kernel *shp, int shmflg)
+{
+ return basic_ipc_security_check(&shp->shm_perm, current);
+}
+
+static int
+jail_shm_shmctl(struct shmid_kernel *shp, int cmd)
+{
+ if (cmd == IPC_INFO || cmd == SHM_INFO)
+ return 0;
+
+ return basic_ipc_security_check(&shp->shm_perm, current);
+}
+
+static int
+jail_shm_shmat(struct shmid_kernel *shp, char *shmaddr, int shmflg)
+{
+ return basic_ipc_security_check(&shp->shm_perm, current);
+}
+
+static int
+jail_msg_queue_alloc(struct msg_queue *msq)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+ set_ipc_security(msq->q_perm, tsec);
+ kref_get(&tsec->kref);
+ return 0;
+}
+
+static void
+jail_msg_queue_free(struct msg_queue *msq)
+{
+ free_ipc_security(&msq->q_perm);
+}
+
+static int jail_msg_queue_associate(struct msg_queue *msq, int flag)
+{
+ return basic_ipc_security_check(&msq->q_perm, current);
+}
+
+static int
+jail_msg_queue_msgctl(struct msg_queue *msq, int cmd)
+{
+ if (cmd == IPC_INFO || cmd == MSG_INFO)
+ return 0;
+
+ return basic_ipc_security_check(&msq->q_perm, current);
+}
+
+static int
+jail_msg_queue_msgsnd(struct msg_queue *msq, struct msg_msg *msg, int msqflg)
+{
+ return basic_ipc_security_check(&msq->q_perm, current);
+}
+
+static int
+jail_msg_queue_msgrcv(struct msg_queue *msq, struct msg_msg *msg,
+ struct task_struct *target, long type, int mode)
+
+{
+ return basic_ipc_security_check(&msq->q_perm, target);
+}
+
+static int
+jail_sem_alloc_security(struct sem_array *sma)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+ set_ipc_security(sma->sem_perm, tsec);
+ kref_get(&tsec->kref);
+ return 0;
+}
+
+static void
+jail_sem_free_security(struct sem_array *sma)
+{
+ free_ipc_security(&sma->sem_perm);
+}
+
+static int
+jail_sem_associate(struct sem_array *sma, int semflg)
+{
+ return basic_ipc_security_check(&sma->sem_perm, current);
+}
+
+static int
+jail_sem_semctl(struct sem_array *sma, int cmd)
+{
+ if (cmd == IPC_INFO || cmd == SEM_INFO)
+ return 0;
+ return basic_ipc_security_check(&sma->sem_perm, current);
+}
+
+static int
+jail_sem_semop(struct sem_array *sma, struct sembuf *sops, unsigned nsops,
+ int alter)
+{
+ return basic_ipc_security_check(&sma->sem_perm, current);
+}
+
+static struct security_operations bsdjail_security_ops = {
+ .ptrace = jail_ptrace,
+ .capable = jail_capable,
+
+ .task_kill = jail_task_kill,
+ .task_alloc_security = jail_task_alloc_security,
+ .task_free_security = jail_task_free_security,
+ .bprm_alloc_security = jail_bprm_alloc_security,
+ .task_create = jail_security_task_create,
+ .task_to_inode = jail_task_to_inode,
+ .task_lookup = jail_task_lookup,
+
+ .task_setrlimit = jail_task_setrlimit,
+ .task_setscheduler = jail_task_setscheduler,
+
+ .setprocattr = jail_setprocattr,
+ .getprocattr = jail_getprocattr,
+
+ .file_set_fowner = jail_file_set_fowner,
+ .file_send_sigiotask = jail_file_send_sigiotask,
+ .file_free_security = free_file_security,
+
+ .socket_bind = jail_socket_bind,
+ .socket_listen = jail_socket_listen,
+ .socket_post_create = jail_socket_post_create,
+ .unix_stream_connect = jail_socket_unix_stream_connect,
+ .unix_may_send = jail_socket_unix_may_send,
+ .sk_free_security = free_sock_security,
+
+ .inode_mknod = jail_inode_mknod,
+ .inode_permission = jail_inode_permission,
+ .inode_free_security = free_inode_security,
+ .inode_getattr = jail_inode_getattr,
+ .inode_getxattr = jail_inode_getxattr,
+ .sb_mount = jail_mount,
+ .sb_umount = jail_umount,
+
+ .ipc_permission = jail_ipc_permission,
+ .shm_alloc_security = jail_shm_alloc_security,
+ .shm_free_security = jail_shm_free_security,
+ .shm_associate = jail_shm_associate,
+ .shm_shmctl = jail_shm_shmctl,
+ .shm_shmat = jail_shm_shmat,
+
+ .msg_queue_alloc_security = jail_msg_queue_alloc,
+ .msg_queue_free_security = jail_msg_queue_free,
+ .msg_queue_associate = jail_msg_queue_associate,
+ .msg_queue_msgctl = jail_msg_queue_msgctl,
+ .msg_queue_msgsnd = jail_msg_queue_msgsnd,
+ .msg_queue_msgrcv = jail_msg_queue_msgrcv,
+
+ .sem_alloc_security = jail_sem_alloc_security,
+ .sem_free_security = jail_sem_free_security,
+ .sem_associate = jail_sem_associate,
+ .sem_semctl = jail_sem_semctl,
+ .sem_semop = jail_sem_semop,
+};
+
+static int __init bsdjail_init (void)
+{
+ int rc = 0;
+
+ if (register_security (&bsdjail_security_ops)) {
+ printk (KERN_INFO
+ "Failure registering BSD Jail module with the kernel\n");
+
+ rc = mod_reg_security(MY_NAME, &bsdjail_security_ops);
+ if (rc < 0) {
+ printk (KERN_INFO "Failure registering BSD Jail "
+ " module with primary security module.\n");
+ return -EINVAL;
+ }
+ secondary = 1;
+ }
+ printk (KERN_INFO "BSD Jail module initialized.\n");
+
+ return 0;
+}
+
+static void __exit bsdjail_exit (void)
+{
+ if (secondary) {
+ if (mod_unreg_security (MY_NAME, &bsdjail_security_ops))
+ printk (KERN_INFO "Failure unregistering BSD Jail "
+ " module with primary module.\n");
+ } else {
+ if (unregister_security (&bsdjail_security_ops)) {
+ printk (KERN_INFO "Failure unregistering BSD Jail "
+ "module with the kernel\n");
+ }
+ }
+
+ printk (KERN_INFO "BSD Jail module removed\n");
+}
+
+security_initcall (bsdjail_init);
+module_exit (bsdjail_exit);
+
+MODULE_DESCRIPTION("BSD Jail LSM.");
+MODULE_LICENSE("GPL");
diff -Nru /home/hallyn/kernels/linux-2.6.8.1/security/Kconfig linux-2.6.8.1/security/Kconfig
--- /home/hallyn/kernels/linux-2.6.8.1/security/Kconfig 2004-08-14 05:55:47.000000000 -0500
+++ linux-2.6.8.1/security/Kconfig 2004-09-10 14:13:40.521097760 -0500
@@ -46,5 +46,16 @@
source security/selinux/Kconfig
+config SECURITY_BSDJAIL
+ tristate "BSD Jail LSM"
+ depends on SECURITY
+ select SECURITY_NETWORK
+ help
+ Provides BSD Jail compartmentalization functionality.
+ See Documentation/bsdjail.txt for more information and
+ usage instructions.
+
+ If you are unsure how to answer this question, answer N.
+
endmenu
diff -Nru /home/hallyn/kernels/linux-2.6.8.1/security/Makefile linux-2.6.8.1/security/Makefile
--- /home/hallyn/kernels/linux-2.6.8.1/security/Makefile 2004-08-14 05:55:48.000000000 -0500
+++ linux-2.6.8.1/security/Makefile 2004-09-10 14:09:02.630343576 -0500
@@ -15,3 +15,4 @@
obj-$(CONFIG_SECURITY_SELINUX) += selinux/built-in.o
obj-$(CONFIG_SECURITY_CAPABILITIES) += commoncap.o capability.o
obj-$(CONFIG_SECURITY_ROOTPLUG) += commoncap.o root_plug.o
+obj-$(CONFIG_SECURITY_BSDJAIL) += bsdjail.o
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] BSD Jail LSM (3/3)
2004-09-10 20:21 [PATCH] BSD Jail LSM (1/3) Serge Hallyn
2004-09-10 20:23 ` [PATCH] BSD Jail LSM (2/3) Serge Hallyn
@ 2004-09-10 20:23 ` Serge Hallyn
1 sibling, 0 replies; 12+ messages in thread
From: Serge Hallyn @ 2004-09-10 20:23 UTC (permalink / raw)
To: Chris Wright; +Cc: linux-kernel, akpm
[-- Attachment #1: Type: text/plain, Size: 142 bytes --]
Attached is a patch carrying the documentation for the bsdjail LSM.
Please apply.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
-serge
[-- Attachment #2: jail-doc.diff --]
[-- Type: text/x-patch, Size: 4130 bytes --]
diff -Nru /home/hallyn/kernels/linux-2.6.8.1/Documentation/bsdjail.txt linux-2.6.8.1/Documentation/bsdjail.txt
--- /home/hallyn/kernels/linux-2.6.8.1/Documentation/bsdjail.txt 1969-12-31 18:00:00.000000000 -0600
+++ linux-2.6.8.1/Documentation/bsdjail.txt 2004-09-10 14:12:59.163385088 -0500
@@ -0,0 +1,99 @@
+BSD Jail Linux Security Module
+Serge E. Hallyn <serue@us.ibm.com>
+
+Description:
+
+Implements a subset of the BSD Jail functionality as a Linux LSM.
+What is currently implemented:
+
+ If a proces is in a jail, it:
+
+ 1. Is locked under a chroot (as are all children) which is not
+ vulnerable to the well-known chdir(..)(etc)chroot(.) escape.
+ 2. Cannot mount or umount
+ 3. Cannot send signals outside of jail
+ 4. Cannot ptrace processes outside of jail
+ 5. Cannot create devices
+ 6. Cannot renice processes
+ 7. Cannot load or unload modules
+ 8. Cannot change network settings
+ 9. May be assigned a specific ip address which will be used
+ for all it's socket binds.
+ 10. Cannot see contents of /proc/<pid> entries of processes not in the
+ same jail. (We hide their existence for convenience's sake, but
+ their existance can still be detected using, for instance, statfs)
+ 11. Has no CAP_SYS_RAWIO capability (no ioperm/iopl)
+ 12. May not share IPC resources with processes outside its own jail.
+ 13. May find it's valid network address (if restricted) under
+ /proc/$$/attr/current.
+
+WARNINGS:
+The security of this module is very much dependent on the security
+of the rest of the system. You must carefully think through your
+use of the system.
+
+Some examples:
+ 1. If you leave /dev/hda1 in the jail, processes in the
+ jail can access that filesystem (i.e. /sbin/debugfs).
+ 2. If you provide root access within a jail, this can of
+ course be used to setuid binaries in the jail. Combined
+ with an unjailed regular user account, this gives jailed
+ users unjailed root access. (thanks to Brad Spender for
+ pointing this out). To protect against this, use jails
+ in private namespaces, with the jail filesystems mounted
+ ONLY within the jail namespaces. For instance:
+
+$ # (Make sure /dev/hdc5 is not mounted anywhere)
+$ new_namespace_shell /bin/bash
+$ mount /dev/hdc5 /opt
+$ mount -t proc proc /opt/proc
+$ echo -n "root /opt" > /proc/$$/attr/exec
+$ echo -n "ip 9.53.94.111" > /proc/$$/attr/exec
+$ exec /bin/sh
+$ sshd
+$ apachectl start
+$ exit
+
+How to use:
+ 1. modprobe bsdjail
+ [ 1.5 /sbin/ifconfig eth0:0 2.2.2.2;
+ 1.6 /sbin/route add -host 2.2.2.2 dev eth0:0
+ (optional) ]
+ 2. Make sure the root filesystem (ie /dev/hdc5) is not mounted
+ anywhere else.
+ 3. exec_private_namespace /bin/sh
+ 4. mount /dev/hdc5 /opt
+ 5. mount -t proc proc /opt/proc
+ 6. echo -n "root /opt" > /proc/$$/attr/exec
+ echo -n "ip 2.2.2.2" > /proc/$$/attr/exec (optional)
+ 7. exec /bin/sh
+ 8. sshd
+ 9. exit
+
+The new shell will now run in a private jail on the filesystem on
+/dev/hdc5. If proc has been mounted under /dev/hdc5, then a "ps -auxw"
+under the jailed shell will show only entries for processes started under
+that jail.
+
+If a private IP was specified for the jail, then
+ cat /proc/$$/attr/current
+will show the address for the private network device. Other network
+devices will be visible through /sbin/ifconfig -a, but not usable.
+
+If the reading process is not in a jail, then
+ cat /proc/$$/attr/current
+returns information about the root and ip * for the target process,
+or "Not Jailed" if the target process is not jailed.
+
+Cat /proc/$$/attr/exec gives a list of the valid keywords to cat into
+/proc/$$/attr/exec when starting a jail.
+
+Current valid keywords for creating a jail are:
+
+ root: Root of jail's fs
+ ip: Ip addr for this jail
+ nrtask: Number of tasks in this jail
+ nice: The nice level for this jail. (maybe should be min/max?)
+ slice: Max timeslice per process
+ data: Max size of DATA segment per process
+ memlock: Max size of memory which can be locked per process
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] BSD Jail LSM (2/3)
2004-09-10 20:23 ` [PATCH] BSD Jail LSM (2/3) Serge Hallyn
2004-09-10 19:31 ` Alan Cox
@ 2004-09-12 21:12 ` Herbert Poetzl
1 sibling, 0 replies; 12+ messages in thread
From: Herbert Poetzl @ 2004-09-12 21:12 UTC (permalink / raw)
To: Serge Hallyn; +Cc: Chris Wright, linux-kernel, akpm
Greetings Serge!
On Fri, Sep 10, 2004 at 03:23:07PM -0500, Serge Hallyn wrote:
> Attached is a patch against the security Kconfig and Makefile to support
> bsdjail, as well as the bsdjail.c file itself. bsdjail offers
> functionality similar to (but more limited than) the vserver patch.
>
> A process in a jail lives under a chroot which is not vulnerable to the
> well-known chdir(...)(etc)chroot(.) attack against normal chroots, and
> may be locked to one ip address. For additional features, please see
> Documentation/bsdjail.txt, which is included in the next patch.
sounds good, maybe linux-vserver and bsdjail can
share/utilize common code/functionality here?
(will have a look at the code soon)
also interresting enhancements might be
- private namespaces (linux-vserver uses them)
- certain virtualizations (loadavg, ...)
anyway, let me know if you are interested in
any cooperation ...
best,
Herbert
> The patch applies cleanly to 2.6.8.1, and has been tested on xSeries,
> pSeries, and zSeries.
>
> Please apply.
>
> Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
>
> -serge
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] BSD Jail LSM (2/3)
2004-09-10 19:31 ` Alan Cox
@ 2004-09-12 23:33 ` Serge E. Hallyn
2004-09-13 10:56 ` Alan Cox
0 siblings, 1 reply; 12+ messages in thread
From: Serge E. Hallyn @ 2004-09-12 23:33 UTC (permalink / raw)
To: Alan Cox; +Cc: Chris Wright, Linux Kernel Mailing List, akpm
[-- Attachment #1: Type: text/plain, Size: 887 bytes --]
Quoting Alan Cox (alan@lxorguk.ukuu.org.uk):
> On Gwe, 2004-09-10 at 21:23, Serge Hallyn wrote:
> > Attached is a patch against the security Kconfig and Makefile to support
> > bsdjail, as well as the bsdjail.c file itself. bsdjail offers
> > functionality similar to (but more limited than) the vserver patch.
>
> Looking over the code the first question I would ask is that it supports
Thank you for looking at it.
> AF_INET but not AF_INET6. That seems a bit limited in todays internet
> environment.
bsdjail.c in the attached version of jail.diff adds support for ipv6.
This was my first time using ipv6, so please let me know if I'm going
about it all wrong.
Right now one must choose between either an ipv4 or ipv6 interface.
Is typical ipv6 usage such that it would be preferable to be able to
specify one of each?
Compiles and tests on a Crusoe laptop.
thanks,
-serge
[-- Attachment #2: jail.diff --]
[-- Type: text/plain, Size: 39780 bytes --]
diff -Nrup /home/hallyn/kernel/linux-2.6.8.1/security/Kconfig linux-2.6.8.1/security/Kconfig
--- /home/hallyn/kernel/linux-2.6.8.1/security/Kconfig 2004-08-14 05:55:47.000000000 -0500
+++ linux-2.6.8.1/security/Kconfig 2004-09-10 17:50:12.000000000 -0500
@@ -46,5 +46,16 @@ config SECURITY_ROOTPLUG
source security/selinux/Kconfig
+config SECURITY_BSDJAIL
+ tristate "BSD Jail LSM"
+ depends on SECURITY
+ select SECURITY_NETWORK
+ help
+ Provides BSD Jail compartmentalization functionality.
+ See Documentation/bsdjail.txt for more information and
+ usage instructions.
+
+ If you are unsure how to answer this question, answer N.
+
endmenu
diff -Nrup /home/hallyn/kernel/linux-2.6.8.1/security/Makefile linux-2.6.8.1/security/Makefile
--- /home/hallyn/kernel/linux-2.6.8.1/security/Makefile 2004-08-14 05:55:48.000000000 -0500
+++ linux-2.6.8.1/security/Makefile 2004-09-10 17:50:12.000000000 -0500
@@ -15,3 +15,4 @@ obj-$(CONFIG_SECURITY) += security.o d
obj-$(CONFIG_SECURITY_SELINUX) += selinux/built-in.o
obj-$(CONFIG_SECURITY_CAPABILITIES) += commoncap.o capability.o
obj-$(CONFIG_SECURITY_ROOTPLUG) += commoncap.o root_plug.o
+obj-$(CONFIG_SECURITY_BSDJAIL) += bsdjail.o
diff -Nrup /home/hallyn/kernel/linux-2.6.8.1/security/bsdjail.c linux-2.6.8.1/security/bsdjail.c
--- /home/hallyn/kernel/linux-2.6.8.1/security/bsdjail.c 1969-12-31 18:00:00.000000000 -0600
+++ linux-2.6.8.1/security/bsdjail.c 2004-09-12 11:55:09.000000000 -0500
@@ -0,0 +1,1520 @@
+/*
+ * File: linux/security/bsdjail.c
+ * Author: Serge Hallyn (serue@us.ibm.com)
+ * Date: Sep 12, 2004
+ *
+ * (See Documentation/bsdjail.txt for more information)
+ *
+ * Copyright (C) 2004 International Business Machines <serue@us.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/config.h>
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/security.h>
+#include <linux/namei.h>
+#include <linux/namespace.h>
+#include <linux/proc_fs.h>
+#include <linux/in.h>
+#include <linux/in6.h>
+#include <linux/pagemap.h>
+#include <linux/ip.h>
+#include <net/ipv6.h>
+#include <linux/mount.h>
+#include <asm/uaccess.h>
+#include <linux/netdevice.h>
+#include <linux/inetdevice.h>
+#include <linux/seq_file.h>
+#include <linux/un.h>
+#include <linux/smp_lock.h>
+#include <linux/kref.h>
+
+static int jail_debug = 0;
+MODULE_PARM(jail_debug, "i");
+MODULE_PARM_DESC(jail_debug, "Print bsd jail debugging messages.\n");
+
+#define DBG 0
+#define WARN 1
+#define bsdj_debug(how, fmt, arg... ) \
+ do { \
+ if ( how || jail_debug ) \
+ printk(KERN_NOTICE "%s: %s: " fmt, \
+ MY_NAME, __FUNCTION__, \
+ ## arg ); \
+ } while ( 0 )
+
+#define MY_NAME "bsdjail"
+
+/* flag to keep track of how we were registered */
+static int secondary = 0;
+
+/*
+ * The task structure holding jail information.
+ * Taskp->security points to one of these (or is null).
+ * There is exactly one jail_struct for each jail. If >1 process
+ * are in the same jail, they share the same jail_struct.
+ */
+struct jail_struct {
+ struct kref kref;
+
+ /* these are set on writes to /proc/<pid>/attr/exec */
+ char *root_pathname; /* char * containing path to use as jail / */
+ char *ip_addr_name; /* char * containing ip addr to use for jail */
+
+ /* these are set when a jail becomes active */
+ union {
+ __u32 a4; /* internal form of ip_addr_name */
+ struct in6_addr a6;
+ } realaddr;
+ struct dentry *dentry; /* dentry of fs root */
+ struct vfsmount *mnt; /* vfsmnt of fs root */
+
+ /* Resource limits. 0 = no limit */
+ int max_nrtask; /* maximum number of tasks within this jail. */
+ int cur_nrtask; /* current number of tasks within this jail. */
+ long maxtimeslice; /* max timeslice in ms for procs in this jail */
+ long nice; /* nice level for processes in this jail */
+ long max_data, max_memlock; /* equivalent to RLIMIT_{DATA,MEMLOCK} */
+/* values for the jail_flags field */
+#define GOT_NETWORK 1 /* if not set, jail can use any valid net address */
+#define IN_USE 2 /* if 0, task is setting up jail, not yet in it */
+#define IS_IPV6 4 /* if 0, ipv4, else ipv6 */
+ char jail_flags;
+};
+
+#define in_use(x) (x->jail_flags & IN_USE)
+#define set_in_use(x) (x->jail_flags |= IN_USE)
+
+#define got_network(x) (x->jail_flags & GOT_NETWORK)
+#define set_got_network(x) (x->jail_flags |= GOT_NETWORK)
+#define unset_got_network(x) (x->jail_flags &= ~GOT_NETWORK)
+
+#define is_ipv4(x) (!(x->jail_flags & IS_IPV6))
+#define is_ipv6(x) (x->jail_flags & IS_IPV6)
+#define set_ipv4(x) (x->jail_flags &= ~IS_IPV6)
+#define set_ipv6(x) (x->jail_flags |= IS_IPV6)
+
+/*
+ * structs, defines, and functions to cope with stacking
+ */
+
+#define get_task_security(task) (task->security)
+#define get_inode_security(inode) (inode->i_security)
+#define get_sock_security(sock) (sock->sk_security)
+#define get_file_security(file) (file->f_security)
+#define get_ipc_security(ipc) (ipc->security)
+
+#define jail_of(proc) (get_task_security(proc))
+
+/*
+ * disable_jail: A jail which was in use, but has no references
+ * left, is disabled - we free up the mountpoint and dentry, and
+ * give up our reference on the module.
+ *
+ * don't need to put namespace, it will be done automatically
+ * when the last process in jail is put.
+ * DO need to put the dentry and vfsmount
+ */
+static void
+disable_jail(struct jail_struct *tsec)
+{
+ dput(tsec->dentry);
+ mntput(tsec->mnt);
+ module_put(THIS_MODULE);
+}
+
+
+static void free_jail(struct jail_struct *tsec)
+{
+ if (!tsec)
+ return;
+
+ if (tsec->root_pathname)
+ kfree(tsec->root_pathname);
+ if (tsec->ip_addr_name)
+ kfree(tsec->ip_addr_name);
+ kfree(tsec);
+}
+
+#define set_task_security(task,data) task->security = data
+#define set_inode_security(inode,data) inode->i_security = data
+#define set_sock_security(sock,data) sock->sk_security = data
+#define set_file_security(file,data) file->f_security = data
+#define set_ipc_security(ipc,data) ipc.security = data
+
+/*
+ * jail_task_free_security: this is the callback hooked into LSM.
+ * If there was no task->security field for bsdjail, do nothing.
+ * If there was, but it was never put into use, free the jail.
+ * If there was, and the jail is in use, then decrement the usage
+ * count, and disable and free the jail if the usage count hits 0.
+ */
+static void jail_task_free_security(struct task_struct *task)
+{
+ struct jail_struct *tsec;
+
+ tsec = get_task_security(task);
+
+ if (!tsec)
+ return;
+
+ if (!in_use(tsec)) {
+ /*
+ * someone did 'echo -n x > /proc/<pid>/attr/exec' but
+ * then forked before execing. Nuke the old info.
+ */
+ free_jail(tsec);
+ set_task_security(task,NULL);
+ return;
+ }
+ tsec->cur_nrtask--;
+ /* If this was the last process in the jail, delete the jail */
+ kref_put(&tsec->kref);
+}
+
+static struct jail_struct *
+alloc_task_security(struct task_struct *tsk)
+{
+ struct jail_struct *tsec;
+ tsec = kmalloc(sizeof(struct jail_struct), GFP_KERNEL);
+ if (!tsec)
+ return ERR_PTR(-ENOMEM);
+ memset(tsec, 0, sizeof(struct jail_struct));
+ set_task_security(tsk, tsec);
+ return tsec;
+}
+
+static inline int
+in_jail(struct task_struct *t)
+{
+ struct jail_struct *tsec = jail_of(t);
+
+ if (tsec && in_use(tsec))
+ return 1;
+
+ return 0;
+}
+
+/*
+ * If a network address was passed into /proc/<pid>/attr/exec,
+ * then process in its jail will only be allowed to bind/listen
+ * to that address.
+ */
+void
+setup_netaddress(struct jail_struct *tsec)
+{
+ unsigned int a,b,c,d, i;
+ unsigned int x[8];
+
+ unset_got_network(tsec);
+ ipv6_addr_set(&tsec->realaddr.a6, 0, 0, 0, 0);
+ if (!tsec->ip_addr_name) {
+ printk(KERN_NOTICE "%s: exiting\n", __FUNCTION__);
+ return;
+ }
+
+ if (is_ipv6(tsec)) {
+ printk(KERN_NOTICE "%s: is ipv6\n", __FUNCTION__);
+ if (sscanf(tsec->ip_addr_name,"%x:%x:%x:%x:%x:%x:%x:%x",
+ &x[0], &x[1], &x[2], &x[3], &x[4], &x[5], &x[6],
+ &x[7]) != 8) {
+ printk(KERN_NOTICE "%s: bad ipv6 addr %s\n", __FUNCTION__,
+ tsec->ip_addr_name);
+ return;
+ }
+ for (i=0; i<8; i++) {
+ if (x[i] > 65535) {
+ printk("%s: %x > 65535 at %d\n", __FUNCTION__, x[i], i);
+ return;
+ }
+ tsec->realaddr.a6.in6_u.u6_addr16[i] = htons(x[i]);
+ }
+ } else {
+ if (sscanf(tsec->ip_addr_name,"%u.%u.%u.%u",&a,&b,&c,&d)!=4)
+ return;
+ if (a>255 || b>255 || c>255 || d>255)
+ return;
+ tsec->realaddr.a4 = htonl((a<<24)|(b<<16)|(c<<8)|d);
+ }
+ set_got_network(tsec);
+ bsdj_debug(DBG, "Network set up (%s)\n", tsec->ip_addr_name);
+}
+
+/* release_jail:
+ * Callback for kref_put to use for releasing a jail when its
+ * last user exits.
+ */
+static void release_jail(struct kref *kref)
+{
+ struct jail_struct *tsec;
+
+ tsec = container_of(kref,struct jail_struct,kref);
+ disable_jail(tsec);
+ free_jail(tsec);
+}
+
+/*
+ * enable_jail:
+ * Called when a process is placed into a new jail to handle the
+ * actual creation of the jail.
+ * Creates namespace
+ * Sets process root+pwd
+ * Stores the requested ip address
+ * Registers a unique pseudo-proc filesystem for this jail
+ */
+int enable_jail(struct task_struct *tsk)
+{
+ struct nameidata nd;
+ struct jail_struct *tsec;
+ int retval = -EFAULT;
+
+ tsec = jail_of(tsk);
+ if (!tsec || !tsec->root_pathname)
+ goto out;
+
+ /*
+ * USE_JAIL_NAMESPACE: could be useful, so that future mounts outside
+ * the jail don't affect the jail. But it's not necessary, and
+ * requires exporting copy_namespace from fs/namespace.c
+ *
+ * Actually, it would also be useful for truly hiding
+ * information about mounts which do not exist in this jail.
+#define USE_JAIL_NAMESPACE
+ */
+#ifdef USE_JAIL_NAMESPACE
+ bsdj_debug(DBG, "bsdjail: copying namespace.\n");
+ retval = -EPERM;
+ if (copy_namespace(CLONE_NEWNS, tsk))
+ goto out;
+ bsdj_debug(DBG, "bsdjail: copied namespace.\n");
+#endif
+
+ /* find our new root directory */
+ bsdj_debug(DBG, "bsdjail: looking up %s\n", tsec->root_pathname);
+ retval = path_lookup(tsec->root_pathname, LOOKUP_FOLLOW | LOOKUP_DIRECTORY, &nd);
+ if (retval)
+ goto out;
+
+ bsdj_debug(DBG, "bsdjail: got %s, setting root to it\n", tsec->root_pathname);
+
+ /* and set the fsroot to it */
+ set_fs_root(tsk->fs, nd.mnt, nd.dentry);
+ set_fs_pwd(tsk->fs, nd.mnt, nd.dentry);
+
+ bsdj_debug(DBG, "bsdjail: root has been set. Have fun.\n");
+
+ /* set up networking */
+ if (tsec->ip_addr_name)
+ setup_netaddress(tsec);
+
+ tsec->cur_nrtask = 1;
+ if (tsec->nice)
+ set_user_nice(current, tsec->nice);
+ if (tsec->max_data) {
+ current->rlim[RLIMIT_DATA].rlim_cur = tsec->max_data;
+ current->rlim[RLIMIT_DATA].rlim_max = tsec->max_data;
+ }
+ if (tsec->max_memlock) {
+ current->rlim[RLIMIT_MEMLOCK].rlim_cur = tsec->max_memlock;
+ current->rlim[RLIMIT_MEMLOCK].rlim_max = tsec->max_memlock;
+ }
+ if (tsec->maxtimeslice) {
+ current->rlim[RLIMIT_CPU].rlim_cur = tsec->maxtimeslice;
+ current->rlim[RLIMIT_CPU].rlim_max = tsec->maxtimeslice;
+ }
+ /* success and end */
+ tsec->mnt = mntget(nd.mnt);
+ tsec->dentry = dget(nd.dentry);
+ path_release(&nd);
+ kref_init(&tsec->kref, release_jail);
+ set_in_use(tsec);
+
+ /* won't let ourselves be removed until this jail goes away */
+ try_module_get(THIS_MODULE);
+
+ return 0;
+
+out:
+ return retval;
+}
+
+/*
+ * LSM /proc/<pid>/attr hooks.
+ * You may write into /proc/<pid>/attr/exec:
+ * root /some/path
+ * ip 2.2.2.2
+ * These values will be used on the next exec() to set up your jail
+ * (assuming you're not already in a jail)
+ */
+static int
+jail_setprocattr(struct task_struct *p, char *name, void *value, size_t size)
+{
+ struct jail_struct *tsec = jail_of(current);
+ long val;
+ int start, len;
+
+ if (tsec && in_use(tsec))
+ return -EINVAL; /* let them guess why */
+
+ if (p != current || strcmp(name, "exec"))
+ return -EPERM;
+
+ if (strncmp(value, "root ", 5)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ if (tsec->root_pathname)
+ kfree(tsec->root_pathname);
+ start = 5;
+ len = size-start;
+ tsec->root_pathname = kmalloc(len+1, GFP_KERNEL);
+ if (!tsec->root_pathname)
+ return -ENOMEM;
+ strncpy(tsec->root_pathname, value+start, len);
+ tsec->root_pathname[len] = '\0';
+ } else if (strncmp(value, "ip ", 3)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ if (tsec->ip_addr_name)
+ kfree(tsec->ip_addr_name);
+ start = 3;
+ len = size-start;
+ tsec->ip_addr_name = kmalloc(len+1, GFP_KERNEL);
+ if (!tsec->ip_addr_name)
+ return -ENOMEM;
+ strncpy(tsec->ip_addr_name, value+start, len);
+ tsec->ip_addr_name[len] = '\0';
+ set_ipv4(tsec);
+ } else if (strncmp(value, "ip6 ", 4) == 0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ if (tsec->ip_addr_name)
+ kfree(tsec->ip_addr_name);
+ start = 4;
+ len = size-start;
+ tsec->ip_addr_name = kmalloc(len+1, GFP_KERNEL);
+ if (!tsec->ip_addr_name)
+ return -ENOMEM;
+ strncpy(tsec->ip_addr_name, value+start, len);
+ tsec->ip_addr_name[len] = '\0';
+ set_ipv6(tsec);
+
+ /* the next two are equivalent */
+ } else if (strncmp(value, "slice ", 6)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ val = simple_strtoul(value+6, NULL, 0);
+ tsec->maxtimeslice = val;
+ } else if (strncmp(value, "timeslice ", 10)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ val = simple_strtoul(value+10, NULL, 0);
+ tsec->maxtimeslice = val;
+ } else if (strncmp(value, "nrtask ", 7)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ val = (int) simple_strtol(value+7, NULL, 0);
+ if (val < 1)
+ return -EINVAL;
+ tsec->max_nrtask = val;
+ } else if (strncmp(value, "memlock ", 8)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ val = simple_strtoul(value+8, NULL, 0);
+ tsec->max_memlock = val;
+ } else if (strncmp(value, "data ", 5)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ val = simple_strtoul(value+5, NULL, 0);
+ tsec->max_data = val;
+ } else if (strncmp(value, "nice ", 5)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ val = simple_strtoul(value+5, NULL, 0);
+ tsec->nice = val;
+ } else
+ return -EINVAL;
+
+ return size;
+}
+
+static int print_jail_net_info(struct jail_struct *j, char *buf, int maxcnt)
+{
+ if (j->ip_addr_name)
+ return snprintf(buf, maxcnt, "%s\n", j->ip_addr_name);
+
+ return snprintf(buf, maxcnt, "No network information\n");
+}
+
+/*
+ * LSM /proc/<pid>/attr read hook.
+ *
+ * /proc/$$/attr/current output:
+ * If the reading process, say process 1001, is in a jail, then
+ * cat /proc/999/attr/current
+ * will print networking information.
+ * If the reading process, say process 1001, is not in a jail, then
+ * cat /proc/999/attr/current
+ * will return
+ * root: (root of jail)
+ * ip: (ip address of jail)
+ * if 999 is in a jail, or
+ * -EINVAL
+ * if 999 is not in a jail.
+ *
+ * /proc/$$/attr/exec output:
+ * A process in a jail gets -EINVAL for /proc/$$/attr/exec.
+ * A process not in a jail gets hints on starting a jail.
+ */
+static int
+jail_getprocattr(struct task_struct *p, char *name, void *value, size_t size)
+{
+ struct jail_struct *tsec;
+ int err = 0;
+
+ if (in_jail(current)) {
+ if (strcmp(name, "current")==0) {
+ /* provide network info */
+ err = print_jail_net_info(jail_of(current), value,
+ size);
+ return err;
+ }
+ return -EINVAL; /* let them guess why */
+ }
+
+ if (strcmp(name, "exec") == 0) {
+ /* Print usage some help */
+ err = snprintf(value, size,
+ "Valid keywords:\n"
+ "root <pathname>\n"
+ "ip <ip4-addr>\n"
+ "ip6 <ip6-addr>\n"
+ "nrtask <max number of tasks in this jail>\n"
+ "nice <nice level for processes in this jail>\n"
+ "slice <max timeslice per process in msecs>\n"
+ "data <max data size per process in bytes>\n"
+ "memlock <max lockable memory per process in bytes>\n");
+ return err;
+ }
+
+ if (strcmp(name, "current"))
+ return -EPERM;
+
+ tsec = jail_of(p);
+ if (!tsec || !in_use(tsec)) {
+ err = snprintf(value, size, "Not Jailed\n");
+ } else {
+ err = snprintf(value, size,
+ "Root: %s\nIP: %s\n"
+ "max_nrtask %d current nrtask %d max_timeslice %lu "
+ "nice %lu\n"
+ "max_memlock %lu max_data %lu\n",
+ tsec->root_pathname,
+ tsec->ip_addr_name ? tsec->ip_addr_name : "(none)",
+ tsec->max_nrtask, tsec->cur_nrtask, tsec->maxtimeslice,
+ tsec->nice, tsec->max_data, tsec->max_memlock);
+ }
+
+ return err;
+}
+
+/*
+ * Forbid a process in a jail from sending a signal to a process in another
+ * (or no) jail through file sigio.
+ *
+ * We consider the process which set the fowner to be the one sending the
+ * signal, rather than the one writing to the file. Therefore we store the
+ * jail of a process during jail_file_set_fowner, then check that against
+ * the jail of the process receiving the signal.
+ */
+static int
+jail_file_send_sigiotask(struct task_struct *tsk, struct fown_struct *fown,
+ int fd, int reason)
+{
+ struct file *file;
+ struct jail_struct *tsec, *fsec;
+
+ if (!in_jail(current))
+ return 0;
+
+ file = (struct file *)((long)fown - offsetof(struct file,f_owner));
+ tsec = jail_of(tsk);
+ fsec = get_file_security(file);
+
+ if (fsec != tsec)
+ return -EPERM;
+
+ return 0;
+}
+
+static int
+jail_file_set_fowner(struct file *file)
+{
+ struct jail_struct *tsec;
+
+ tsec = jail_of(current);
+ set_file_security(file, tsec);
+ if (tsec)
+ kref_get(&tsec->kref);
+
+ return 0;
+}
+
+static void free_ipc_security(struct kern_ipc_perm *ipc)
+{
+ struct jail_struct *tsec;
+
+ tsec = get_ipc_security(ipc);
+ if (!tsec)
+ return;
+ kref_put(&tsec->kref);
+ set_ipc_security((*ipc), NULL);
+}
+
+static void free_file_security(struct file *file)
+{
+ struct jail_struct *tsec;
+
+ tsec = get_file_security(file);
+ if (!tsec)
+ return;
+ kref_put(&tsec->kref);
+ set_file_security(file, NULL);
+}
+
+static void free_inode_security(struct inode *inode)
+{
+ struct jail_struct *tsec;
+
+ tsec = get_inode_security(inode);
+ if (!tsec)
+ return;
+ kref_put(&tsec->kref);
+ set_inode_security(inode, NULL);
+}
+
+/*
+ * LSM ptrace hook:
+ * process in jail may not ptrace process not in the same jail
+ */
+static int
+jail_ptrace (struct task_struct *tracer, struct task_struct *tracee)
+{
+ struct jail_struct *tsec = jail_of(tracer);
+
+ if (tsec && in_use(tsec)) {
+ if (tsec == jail_of(tracee))
+ return 0;
+ return -EPERM;
+ }
+ return 0;
+}
+
+/*
+ * process in jail may only use one (aliased) ip address. If they try to
+ * attach to 127.0.0.1, that is remapped to their own address. If some
+ * other address (and not their own), deny permission
+ */
+static int jail_socket_unix_bind(struct socket *sock, struct sockaddr *address,
+ int addrlen);
+
+#define loopbackaddr htonl((127 << 24) | 1)
+
+static inline int jail_inet4_bind(struct socket *sock, struct sockaddr *address,
+ int addrlen, struct jail_struct *tsec)
+{
+ struct sockaddr_in *inaddr;
+ __u32 sin_addr, jailaddr;
+
+ if (is_ipv6(tsec))
+ return -EPERM;
+
+ inaddr = (struct sockaddr_in *)address;
+ sin_addr = inaddr->sin_addr.s_addr;
+ jailaddr = tsec->realaddr.a4;
+
+ if (sin_addr == jailaddr)
+ return 0;
+
+ if (sin_addr == loopbackaddr || !sin_addr) {
+ bsdj_debug(DBG, "Got a loopback or 0 address\n");
+ sin_addr = jailaddr;
+ bsdj_debug(DBG, "Converted to: %u.%u.%u.%u\n",
+ NIPQUAD(sin_addr));
+ return 0;
+ }
+
+ return -EPERM;
+}
+
+static inline int
+jail_inet6_bind(struct socket *sock, struct sockaddr *address, int addrlen,
+ struct jail_struct *tsec)
+{
+ struct sockaddr_in6 *inaddr6;
+ struct in6_addr *sin6_addr, *jailaddr;
+
+ printk(KERN_NOTICE "%s: 1\n", __FUNCTION__);
+ if (is_ipv4(tsec))
+ return -EPERM;
+
+ printk(KERN_NOTICE "%s: 2\n", __FUNCTION__);
+
+ inaddr6 = (struct sockaddr_in6 *)address;
+ sin6_addr = &inaddr6->sin6_addr;
+ jailaddr = &tsec->realaddr.a6;
+
+ if (ipv6_addr_cmp(jailaddr, sin6_addr)==0) {
+ printk(KERN_NOTICE "%s: allowing 1\n", __FUNCTION__);
+ return 0;
+ }
+
+ if (ipv6_addr_cmp(sin6_addr, &in6addr_loopback)==0) {
+ printk(KERN_NOTICE "%s: allowing 2\n", __FUNCTION__);
+ bsdj_debug(DBG, "Got a loopback or 0 address\n");
+ ipv6_addr_copy(sin6_addr, jailaddr);
+ return 0;
+ }
+ printk(KERN_NOTICE "%s: DENYING\n", __FUNCTION__);
+ printk(KERN_NOTICE "%s: a %04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x "
+ "j %04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x\n",
+ __FUNCTION__,
+ NIP6(*sin6_addr),
+ NIP6(*jailaddr));
+
+ return -EPERM;
+}
+
+static int
+jail_socket_bind(struct socket *sock, struct sockaddr *address, int addrlen)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ if (sock->sk->sk_family == AF_UNIX)
+ return jail_socket_unix_bind(sock, address, addrlen);
+
+ if (!got_network(tsec))
+ /* If we want to be strict, we could just
+ * deny net access when lacking a pseudo ip.
+ * For now we just allow it. */
+ return 0;
+
+ if (address->sa_family == AF_INET)
+ return jail_inet4_bind(sock, address, addrlen, tsec);
+
+ if (address->sa_family == AF_INET6)
+ return jail_inet6_bind(sock, address, addrlen, tsec);
+
+ return 0;
+}
+
+/*
+ * If locked in an ipv6 jail, don't let them use ipv4, and vice versa
+ */
+static int
+jail_socket_create(int family, int type, int protocol, int kern)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec) || kern || !got_network(tsec))
+ return 0;
+
+ if (family == AF_INET) {
+ if (is_ipv4(tsec))
+ return 0;
+ return -EPERM;
+ }
+
+ if (family == AF_INET6) {
+ if (is_ipv6(tsec))
+ return 0;
+ return -EPERM;
+ }
+
+ return 0;
+}
+
+static void
+jail_socket_post_create(struct socket *sock, int family, int type,
+ int protocol, int kern)
+{
+ struct inet_opt *inet;
+ struct ipv6_pinfo *inet6;
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec) || kern)
+ return;
+ if (!got_network(tsec))
+ return;
+
+ if (family == AF_INET) {
+ inet = inet_sk(sock->sk);
+ inet->saddr = tsec->realaddr.a4;
+ } else if (family == AF_INET6) {
+ inet6 = inet6_sk(sock->sk);
+ ipv6_addr_copy(&inet6->saddr, &tsec->realaddr.a6);
+ }
+
+ return;
+}
+
+static int
+jail_socket_listen(struct socket *sock, int backlog)
+{
+ struct inet_opt *inet;
+ struct ipv6_pinfo *inet6;
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ if (!got_network(tsec))
+ return 0;
+
+ if (sock->sk->sk_family == AF_INET) {
+ inet = inet_sk(sock->sk);
+ if (inet->saddr == tsec->realaddr.a4)
+ return 0;
+ return -EPERM;
+ }
+ if (sock->sk->sk_family == AF_INET) {
+ inet6 = inet6_sk(sock->sk);
+ if (ipv6_addr_cmp(&inet6->saddr, &tsec->realaddr.a6)==0)
+ return 0;
+ return -EPERM;
+ }
+
+ return 0;
+}
+
+static void free_sock_security(struct sock *sk)
+{
+ struct jail_struct *tsec;
+
+ tsec = get_sock_security(sk);
+ if (!tsec)
+ return;
+ kref_put(&tsec->kref);
+ set_sock_security(sk, NULL);
+}
+
+/*
+ * The next three (socket) hooks prevent a process in a jail from sending
+ * data to a abstract unix domain socket which was bound outside the jail.
+ */
+static int
+jail_socket_unix_bind(struct socket *sock, struct sockaddr *address,
+ int addrlen)
+{
+ struct sockaddr_un *sunaddr;
+ struct jail_struct *tsec;
+
+ if (sock->sk->sk_family != AF_UNIX)
+ return 0;
+
+ sunaddr = (struct sockaddr_un *)address;
+ if (sunaddr->sun_path[0] != 0)
+ return 0;
+
+ tsec = jail_of(current);
+ set_sock_security(sock->sk, tsec);
+ if (tsec)
+ kref_get(&tsec->kref);
+ return 0;
+}
+
+/*
+ * Note - we deny sends both from unjailed to jailed, and from jailed
+ * to unjailed. As well as, of course between different jails.
+ */
+static int
+jail_socket_unix_may_send(struct socket *sock, struct socket *other)
+{
+ struct jail_struct *tsec, *ssec;
+
+ tsec = jail_of(current); /* jail of sending process */
+ ssec = get_sock_security(other->sk); /* jail of receiver */
+
+ if (tsec != ssec)
+ return -EPERM;
+
+ return 0;
+}
+
+static int
+jail_socket_unix_stream_connect(struct socket *sock,
+ struct socket *other, struct sock *newsk)
+{
+ struct jail_struct *tsec, *ssec;
+
+ tsec = jail_of(current); /* jail of sending process */
+ ssec = get_sock_security(other->sk); /* jail of receiver */
+
+ if (tsec != ssec)
+ return -EPERM;
+
+ return 0;
+}
+
+static int
+jail_mount(char * dev_name, struct nameidata *nd, char * type,
+ unsigned long flags, void * data)
+{
+ if (in_jail(current))
+ return -EPERM;
+
+ return 0;
+}
+
+static int
+jail_umount(struct vfsmount *mnt, int flags)
+{
+ if (in_jail(current))
+ return -EPERM;
+
+ return 0;
+}
+
+/*
+ * process in jail may not:
+ * use nice
+ * change network config
+ * load/unload modules
+ */
+static int
+jail_capable (struct task_struct *tsk, int cap)
+{
+ if (in_jail(tsk)) {
+ if (cap == CAP_SYS_NICE)
+ return -EPERM;
+ if (cap == CAP_NET_ADMIN)
+ return -EPERM;
+ if (cap == CAP_SYS_MODULE)
+ return -EPERM;
+ if (cap == CAP_SYS_RAWIO)
+ return -EPERM;
+ }
+
+ if (cap_is_fs_cap (cap) ? tsk->fsuid == 0 : tsk->euid == 0)
+ return 0;
+ return -EPERM;
+}
+
+/*
+ * jail_security_task_create:
+ *
+ * If the current process is ina a jail, and that jail is about to exceed a
+ * maximum number of processes, then refuse to fork. If the maximum number
+ * of jails is listed as 0, then there is no limit for this jail, and we allow
+ * all forks.
+ */
+static inline int
+jail_security_task_create (unsigned long clone_flags)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ if (tsec->max_nrtask && tsec->cur_nrtask >= tsec->max_nrtask)
+ return -EPERM;
+ return 0;
+}
+
+/*
+ * The child of a process in a jail belongs in the same jail
+ */
+static int
+jail_task_alloc_security(struct task_struct *tsk)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ set_task_security(tsk, tsec);
+ kref_get(&tsec->kref);
+ tsec->cur_nrtask++;
+ if (tsec->maxtimeslice) {
+ tsk->rlim[RLIMIT_CPU].rlim_max = tsec->maxtimeslice;
+ tsk->rlim[RLIMIT_CPU].rlim_cur = tsec->maxtimeslice;
+ }
+ if (tsec->max_data) {
+ tsk->rlim[RLIMIT_CPU].rlim_max = tsec->max_data;
+ tsk->rlim[RLIMIT_CPU].rlim_cur = tsec->max_data;
+ }
+ if (tsec->max_memlock) {
+ tsk->rlim[RLIMIT_CPU].rlim_max = tsec->max_memlock;
+ tsk->rlim[RLIMIT_CPU].rlim_cur = tsec->max_memlock;
+ }
+ if (tsec->nice)
+ set_user_nice(current, tsec->nice);
+
+ return 0;
+}
+
+static int
+jail_bprm_alloc_security(struct linux_binprm *bprm)
+{
+ struct jail_struct *tsec;
+ int ret;
+
+ tsec = jail_of(current);
+ if (!tsec)
+ return 0;
+
+ if (in_use(tsec))
+ return 0;
+
+ if (tsec->root_pathname) {
+ ret = enable_jail(current);
+ if (ret) {
+ /* if we failed, nix out the root/ip requests */
+ jail_task_free_security(current);
+ return ret;
+ }
+ }
+ return 0;
+}
+
+/*
+ * Process in jail may not create devices
+ * Thanks to Brad Spender for pointing out fifos should be allowed.
+ */
+/* TODO: We may want to allow /dev/log, at least... */
+static int
+jail_inode_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
+{
+ if (!in_jail(current))
+ return 0;
+
+ if (S_ISFIFO(mode))
+ return 0;
+
+ return -EPERM;
+}
+
+/* yanked from fs/proc/base.c */
+static unsigned name_to_int(struct dentry *dentry)
+{
+ const char *name = dentry->d_name.name;
+ int len = dentry->d_name.len;
+ unsigned n = 0;
+
+ if (len > 1 && *name == '0')
+ goto out;
+ while (len-- > 0) {
+ unsigned c = *name++ - '0';
+ if (c > 9)
+ goto out;
+ if (n >= (~0U-9)/10)
+ goto out;
+ n *= 10;
+ n += c;
+ }
+ return n;
+out:
+ return ~0U;
+}
+
+/*
+ * jail_proc_inode_permission:
+ * called only when current is in a jail, and is trying to reach
+ * /proc/<pid>. We check whether <pid> is in the same jail as
+ * current. If not, permission is denied.
+ *
+ * NOTE: On the one hand, the task_to_inode(inode)->i_security
+ * approach seems cleaner, but on the other, this prevents us
+ * from unloading bsdjail for awhile...
+ */
+static int
+jail_proc_inode_permission(struct inode *inode, int mask,
+ struct nameidata *nd)
+{
+ struct jail_struct *tsec = jail_of(current);
+ struct dentry *dentry = nd->dentry;
+ unsigned pid;
+
+ pid = name_to_int(dentry);
+ if (pid == ~0U) {
+ struct qstr *dname = &dentry->d_name;
+ if (strcmp(dname->name, "scsi")==0 ||
+ strcmp(dname->name, "sys")==0 ||
+ strcmp(dname->name, "ide")==0)
+ return -EPERM;
+ return 0;
+ }
+
+ if (dentry->d_parent != dentry->d_sb->s_root)
+ return 0;
+ if (get_inode_security(inode) != tsec)
+ return -ENOENT;
+
+ return 0;
+}
+
+/*
+ * Here is our attempt to prevent chroot escapes.
+ */
+static int
+is_jailroot_parent(struct dentry *candidate, struct dentry *root,
+ struct vfsmount *rootmnt)
+{
+ if (candidate == root)
+ return 0;
+
+ /* simple case: fs->root/.. == candidate */
+ if (root->d_parent == candidate)
+ return 1;
+
+ /*
+ * now more complicated: if fs->root is a mounted directory,
+ * then chdir(..) out of fs->root, at follow_dotdot, will follow
+ * the fs->root mount point. So we must check the parent dir of
+ * the fs->root mount point.
+ */
+ if (rootmnt->mnt_root == root && rootmnt->mnt_mountpoint!=root) {
+ root = rootmnt->mnt_mountpoint;
+ rootmnt = rootmnt->mnt_parent;
+ return is_jailroot_parent(candidate, root, rootmnt);
+ }
+
+ return 0;
+}
+
+/*
+ * A process in a jail may not see that /proc/<pid> exists for
+ * process not in its jail
+ * Unfortunately we can't pretend that pid for the starting process
+ * is 1, as vserver does.
+ */
+static int jail_task_lookup(struct task_struct *p)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec)
+ return 0;
+ if (tsec == jail_of(p))
+ return 0;
+ return -EPERM;
+}
+/*
+ * security_task_to_inode:
+ * Set inode->security = task's jail.
+ */
+static void jail_task_to_inode(struct task_struct *p, struct inode *inode)
+{
+ struct jail_struct *tsec = jail_of(p);
+
+ if (!tsec || !in_use(tsec))
+ return;
+ if (get_inode_security(inode))
+ return;
+ kref_get(&tsec->kref);
+ set_inode_security(inode, tsec);
+}
+
+/*
+ * inode_permission:
+ * If we are trying to look into certain /proc files from in a jail, we
+ * may deny permission.
+ * If we are trying to cd(..), but the cwd is the root of our jail, then
+ * permission is denied.
+ */
+static int
+jail_inode_permission(struct inode *inode, int mask,
+ struct nameidata *nd)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ if (!nd)
+ return 0;
+
+ if (nd->dentry &&
+ strcmp(nd->dentry->d_sb->s_type->name, "proc")==0) {
+ return jail_proc_inode_permission(inode, mask, nd);
+
+ }
+
+ if (!(mask&MAY_EXEC))
+ return 0;
+ if (!inode || !S_ISDIR(inode->i_mode))
+ return 0;
+
+ if (is_jailroot_parent(nd->dentry, tsec->dentry, tsec->mnt)) {
+ bsdj_debug(WARN,"Attempt to chdir(..) out of jail!\n"
+ "(%s is a subdir of %s)\n",
+ tsec->dentry->d_name.name,
+ nd->dentry->d_name.name);
+ return -EPERM;
+ }
+
+ return 0;
+}
+
+/*
+ * A function which returns -ENOENT if dentry is the dentry for
+ * a /proc/<pid> directory. It returns 0 otherwise.
+ */
+static inline int
+generic_procpid_check(struct dentry *dentry)
+{
+ struct jail_struct *jail = jail_of(current);
+ unsigned pid = name_to_int(dentry);
+
+ if (!jail || !in_use(jail))
+ return 0;
+ if (pid == ~0U)
+ return 0;
+ if (strcmp(dentry->d_sb->s_type->name, "proc")!=0)
+ return 0;
+ if (dentry->d_parent != dentry->d_sb->s_root)
+ return 0;
+ if (get_inode_security(dentry->d_inode) != jail)
+ return -ENOENT;
+ return 0;
+}
+
+/*
+ * We want getattr to fail on /proc/<pid> to prevent leakage through, for
+ * instance, ls -d.
+ */
+static int
+jail_inode_getattr(struct vfsmount *mnt, struct dentry *dentry)
+{
+ return generic_procpid_check(dentry);
+}
+
+/* This probably is not necessary - /proc does not support xattrs? */
+static int
+jail_inode_getxattr(struct dentry *dentry, char *name)
+{
+ return generic_procpid_check(dentry);
+}
+
+/* process in jail may not send signal to process not in the same jail */
+static int
+jail_task_kill(struct task_struct *p, struct siginfo *info, int sig)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ if (tsec == jail_of(p))
+ return 0;
+
+ if (sig==SIGCHLD)
+ return 0;
+
+ return -EPERM;
+}
+
+/*
+ * LSM hooks to limit jailed process' abilities to muck with resource
+ * limits
+ */
+static int jail_task_setrlimit (unsigned int resource, struct rlimit *new_rlim)
+{
+ if (!in_jail(current))
+ return 0;
+
+ return -EPERM;
+}
+
+static int jail_task_setscheduler (struct task_struct *p, int policy,
+ struct sched_param *lp)
+{
+ if (!in_jail(current))
+ return 0;
+
+ return -EPERM;
+}
+
+/*
+ * LSM hooks to limit IPC access.
+ */
+
+static inline int
+basic_ipc_security_check(struct kern_ipc_perm *p, struct task_struct *target)
+{
+ struct jail_struct *tsec = jail_of(target);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ if (get_ipc_security(p) != tsec)
+ return -EPERM;
+
+ return 0;
+}
+
+static int
+jail_ipc_permission(struct kern_ipc_perm *ipcp, short flag)
+{
+ return basic_ipc_security_check(ipcp, current);
+}
+
+static int
+jail_shm_alloc_security (struct shmid_kernel *shp)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+ set_ipc_security(shp->shm_perm, tsec);
+ kref_get(&tsec->kref);
+ return 0;
+}
+
+static void
+jail_shm_free_security (struct shmid_kernel *shp)
+{
+ free_ipc_security(&shp->shm_perm);
+}
+
+static int
+jail_shm_associate (struct shmid_kernel *shp, int shmflg)
+{
+ return basic_ipc_security_check(&shp->shm_perm, current);
+}
+
+static int
+jail_shm_shmctl(struct shmid_kernel *shp, int cmd)
+{
+ if (cmd == IPC_INFO || cmd == SHM_INFO)
+ return 0;
+
+ return basic_ipc_security_check(&shp->shm_perm, current);
+}
+
+static int
+jail_shm_shmat(struct shmid_kernel *shp, char *shmaddr, int shmflg)
+{
+ return basic_ipc_security_check(&shp->shm_perm, current);
+}
+
+static int
+jail_msg_queue_alloc(struct msg_queue *msq)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+ set_ipc_security(msq->q_perm, tsec);
+ kref_get(&tsec->kref);
+ return 0;
+}
+
+static void
+jail_msg_queue_free(struct msg_queue *msq)
+{
+ free_ipc_security(&msq->q_perm);
+}
+
+static int jail_msg_queue_associate(struct msg_queue *msq, int flag)
+{
+ return basic_ipc_security_check(&msq->q_perm, current);
+}
+
+static int
+jail_msg_queue_msgctl(struct msg_queue *msq, int cmd)
+{
+ if (cmd == IPC_INFO || cmd == MSG_INFO)
+ return 0;
+
+ return basic_ipc_security_check(&msq->q_perm, current);
+}
+
+static int
+jail_msg_queue_msgsnd(struct msg_queue *msq, struct msg_msg *msg, int msqflg)
+{
+ return basic_ipc_security_check(&msq->q_perm, current);
+}
+
+static int
+jail_msg_queue_msgrcv(struct msg_queue *msq, struct msg_msg *msg,
+ struct task_struct *target, long type, int mode)
+
+{
+ return basic_ipc_security_check(&msq->q_perm, target);
+}
+
+static int
+jail_sem_alloc_security(struct sem_array *sma)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+ set_ipc_security(sma->sem_perm, tsec);
+ kref_get(&tsec->kref);
+ return 0;
+}
+
+static void
+jail_sem_free_security(struct sem_array *sma)
+{
+ free_ipc_security(&sma->sem_perm);
+}
+
+static int
+jail_sem_associate(struct sem_array *sma, int semflg)
+{
+ return basic_ipc_security_check(&sma->sem_perm, current);
+}
+
+static int
+jail_sem_semctl(struct sem_array *sma, int cmd)
+{
+ if (cmd == IPC_INFO || cmd == SEM_INFO)
+ return 0;
+ return basic_ipc_security_check(&sma->sem_perm, current);
+}
+
+static int
+jail_sem_semop(struct sem_array *sma, struct sembuf *sops, unsigned nsops,
+ int alter)
+{
+ return basic_ipc_security_check(&sma->sem_perm, current);
+}
+
+static struct security_operations bsdjail_security_ops = {
+ .ptrace = jail_ptrace,
+ .capable = jail_capable,
+
+ .task_kill = jail_task_kill,
+ .task_alloc_security = jail_task_alloc_security,
+ .task_free_security = jail_task_free_security,
+ .bprm_alloc_security = jail_bprm_alloc_security,
+ .task_create = jail_security_task_create,
+ .task_to_inode = jail_task_to_inode,
+ .task_lookup = jail_task_lookup,
+
+ .task_setrlimit = jail_task_setrlimit,
+ .task_setscheduler = jail_task_setscheduler,
+
+ .setprocattr = jail_setprocattr,
+ .getprocattr = jail_getprocattr,
+
+ .file_set_fowner = jail_file_set_fowner,
+ .file_send_sigiotask = jail_file_send_sigiotask,
+ .file_free_security = free_file_security,
+
+ .socket_bind = jail_socket_bind,
+ .socket_listen = jail_socket_listen,
+ .socket_create = jail_socket_create,
+ .socket_post_create = jail_socket_post_create,
+ .unix_stream_connect = jail_socket_unix_stream_connect,
+ .unix_may_send = jail_socket_unix_may_send,
+ .sk_free_security = free_sock_security,
+
+ .inode_mknod = jail_inode_mknod,
+ .inode_permission = jail_inode_permission,
+ .inode_free_security = free_inode_security,
+ .inode_getattr = jail_inode_getattr,
+ .inode_getxattr = jail_inode_getxattr,
+ .sb_mount = jail_mount,
+ .sb_umount = jail_umount,
+
+ .ipc_permission = jail_ipc_permission,
+ .shm_alloc_security = jail_shm_alloc_security,
+ .shm_free_security = jail_shm_free_security,
+ .shm_associate = jail_shm_associate,
+ .shm_shmctl = jail_shm_shmctl,
+ .shm_shmat = jail_shm_shmat,
+
+ .msg_queue_alloc_security = jail_msg_queue_alloc,
+ .msg_queue_free_security = jail_msg_queue_free,
+ .msg_queue_associate = jail_msg_queue_associate,
+ .msg_queue_msgctl = jail_msg_queue_msgctl,
+ .msg_queue_msgsnd = jail_msg_queue_msgsnd,
+ .msg_queue_msgrcv = jail_msg_queue_msgrcv,
+
+ .sem_alloc_security = jail_sem_alloc_security,
+ .sem_free_security = jail_sem_free_security,
+ .sem_associate = jail_sem_associate,
+ .sem_semctl = jail_sem_semctl,
+ .sem_semop = jail_sem_semop,
+};
+
+static int __init bsdjail_init (void)
+{
+ int rc = 0;
+
+ if (register_security (&bsdjail_security_ops)) {
+ printk (KERN_INFO
+ "Failure registering BSD Jail module with the kernel\n");
+
+ rc = mod_reg_security(MY_NAME, &bsdjail_security_ops);
+ if (rc < 0) {
+ printk (KERN_INFO "Failure registering BSD Jail "
+ " module with primary security module.\n");
+ return -EINVAL;
+ }
+ secondary = 1;
+ }
+ printk (KERN_INFO "BSD Jail module initialized.\n");
+
+ return 0;
+}
+
+static void __exit bsdjail_exit (void)
+{
+ if (secondary) {
+ if (mod_unreg_security (MY_NAME, &bsdjail_security_ops))
+ printk (KERN_INFO "Failure unregistering BSD Jail "
+ " module with primary module.\n");
+ } else {
+ if (unregister_security (&bsdjail_security_ops)) {
+ printk (KERN_INFO "Failure unregistering BSD Jail "
+ "module with the kernel\n");
+ }
+ }
+
+ printk (KERN_INFO "BSD Jail module removed\n");
+}
+
+security_initcall (bsdjail_init);
+module_exit (bsdjail_exit);
+
+MODULE_DESCRIPTION("BSD Jail LSM.");
+MODULE_LICENSE("GPL");
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] BSD Jail LSM (2/3)
2004-09-12 23:33 ` Serge E. Hallyn
@ 2004-09-13 10:56 ` Alan Cox
2004-09-13 15:08 ` Serge E. Hallyn
2004-09-13 23:20 ` [PATCH] BSD Jail LSM Serge Hallyn
0 siblings, 2 replies; 12+ messages in thread
From: Alan Cox @ 2004-09-13 10:56 UTC (permalink / raw)
To: Serge E. Hallyn; +Cc: Chris Wright, Linux Kernel Mailing List, akpm, netdev
On Llu, 2004-09-13 at 00:33, Serge E. Hallyn wrote:
> Right now one must choose between either an ipv4 or ipv6 interface.
> Is typical ipv6 usage such that it would be preferable to be able to
> specify one of each?
Its normal to have both yes.
A more interesting question is whether all of the "which socket for
which use" stuff could be addressed by netfilter chains run at
bind/connect time ?
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] BSD Jail LSM (2/3)
2004-09-13 10:56 ` Alan Cox
@ 2004-09-13 15:08 ` Serge E. Hallyn
2004-09-13 23:20 ` [PATCH] BSD Jail LSM Serge Hallyn
1 sibling, 0 replies; 12+ messages in thread
From: Serge E. Hallyn @ 2004-09-13 15:08 UTC (permalink / raw)
To: Alan Cox; +Cc: Chris Wright, Linux Kernel Mailing List, akpm, netdev
Quoting Alan Cox (alan@lxorguk.ukuu.org.uk):
> On Llu, 2004-09-13 at 00:33, Serge E. Hallyn wrote:
> > Right now one must choose between either an ipv4 or ipv6 interface.
> > Is typical ipv6 usage such that it would be preferable to be able to
> > specify one of each?
>
> Its normal to have both yes.
>
> A more interesting question is whether all of the "which socket for
> which use" stuff could be addressed by netfilter chains run at
> bind/connect time ?
You mean to add two new netfilter hooks? Would these then replace the
LSM hooks?
-serge
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] BSD Jail LSM
2004-09-13 10:56 ` Alan Cox
2004-09-13 15:08 ` Serge E. Hallyn
@ 2004-09-13 23:20 ` Serge Hallyn
2004-09-13 23:58 ` Vincent Hanquez
1 sibling, 1 reply; 12+ messages in thread
From: Serge Hallyn @ 2004-09-13 23:20 UTC (permalink / raw)
To: Alan Cox; +Cc: Chris Wright, Linux Kernel Mailing List, akpm, netdev
[-- Attachment #1: Type: text/plain, Size: 450 bytes --]
On Mon, 2004-09-13 at 05:56, Alan Cox wrote:
> On Llu, 2004-09-13 at 00:33, Serge E. Hallyn wrote:
> > Right now one must choose between either an ipv4 or ipv6 interface.
> > Is typical ipv6 usage such that it would be preferable to be able to
> > specify one of each?
>
> Its normal to have both yes.
The attached version supports simultaneous ipv4 and ipv6 addresses.
(Though only one of each)
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
[-- Attachment #2: jail.diff --]
[-- Type: text/x-patch, Size: 39682 bytes --]
diff -Nru -p1 /home/hallyn/kernels/linux-2.6.8.1/security/bsdjail.c linux-2.6.8.1/security/bsdjail.c
--- /home/hallyn/kernels/linux-2.6.8.1/security/bsdjail.c 1969-12-31 18:00:00.000000000 -0600
+++ linux-2.6.8.1/security/bsdjail.c 2004-09-13 18:00:09.475302344 -0500
@@ -0,0 +1,1528 @@
+/*
+ * File: linux/security/bsdjail.c
+ * Author: Serge Hallyn (serue@us.ibm.com)
+ * Date: Sep 12, 2004
+ *
+ * (See Documentation/bsdjail.txt for more information)
+ *
+ * Copyright (C) 2004 International Business Machines <serue@us.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/config.h>
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/security.h>
+#include <linux/namei.h>
+#include <linux/namespace.h>
+#include <linux/proc_fs.h>
+#include <linux/in.h>
+#include <linux/in6.h>
+#include <linux/pagemap.h>
+#include <linux/ip.h>
+#include <net/ipv6.h>
+#include <linux/mount.h>
+#include <asm/uaccess.h>
+#include <linux/netdevice.h>
+#include <linux/inetdevice.h>
+#include <linux/seq_file.h>
+#include <linux/un.h>
+#include <linux/smp_lock.h>
+#include <linux/kref.h>
+
+static int jail_debug = 0;
+MODULE_PARM(jail_debug, "i");
+MODULE_PARM_DESC(jail_debug, "Print bsd jail debugging messages.\n");
+
+#define DBG 0
+#define WARN 1
+#define bsdj_debug(how, fmt, arg... ) \
+ do { \
+ if ( how || jail_debug ) \
+ printk(KERN_NOTICE "%s: %s: " fmt, \
+ MY_NAME, __FUNCTION__, \
+ ## arg ); \
+ } while ( 0 )
+
+#define MY_NAME "bsdjail"
+
+/* flag to keep track of how we were registered */
+static int secondary = 0;
+
+/*
+ * The task structure holding jail information.
+ * Taskp->security points to one of these (or is null).
+ * There is exactly one jail_struct for each jail. If >1 process
+ * are in the same jail, they share the same jail_struct.
+ */
+struct jail_struct {
+ struct kref kref;
+
+ /* these are set on writes to /proc/<pid>/attr/exec */
+ char *root_pathname; /* char * containing path to use as jail / */
+ char *ip4_addr_name; /* char * containing ip4 addr to use for jail */
+ char *ip6_addr_name; /* char * containing ip6 addr to use for jail */
+
+ /* these are set when a jail becomes active */
+ __u32 addr4; /* internal form of ip4_addr_name */
+ struct in6_addr addr6; /* internal form of ip6_addr_name */
+
+ struct dentry *dentry; /* dentry of fs root */
+ struct vfsmount *mnt; /* vfsmnt of fs root */
+
+ /* Resource limits. 0 = no limit */
+ int max_nrtask; /* maximum number of tasks within this jail. */
+ int cur_nrtask; /* current number of tasks within this jail. */
+ long maxtimeslice; /* max timeslice in ms for procs in this jail */
+ long nice; /* nice level for processes in this jail */
+ long max_data, max_memlock; /* equivalent to RLIMIT_{DATA,MEMLOCK} */
+/* values for the jail_flags field */
+#define IN_USE 1 /* if 0, task is setting up jail, not yet in it */
+#define GOT_IPV4 2
+#define GOT_IPV6 4 /* if 0, ipv4, else ipv6 */
+ char jail_flags;
+};
+
+#define in_use(x) (x->jail_flags & IN_USE)
+#define set_in_use(x) (x->jail_flags |= IN_USE)
+
+#define got_network(x) (x->jail_flags & (GOT_IPV4 | GOT_IPV6))
+#define got_ipv4(x) (x->jail_flags & (GOT_IPV4))
+#define got_ipv6(x) (x->jail_flags & (GOT_IPV6))
+#define set_ipv4(x) (x->jail_flags |= GOT_IPV4)
+#define set_ipv6(x) (x->jail_flags |= GOT_IPV6)
+#define unset_got_ipv4(x) (x->jail_flags &= ~GOT_IPV4)
+#define unset_got_ipv6(x) (x->jail_flags &= ~GOT_IPV6)
+
+/*
+ * structs, defines, and functions to cope with stacking
+ */
+
+#define get_task_security(task) (task->security)
+#define get_inode_security(inode) (inode->i_security)
+#define get_sock_security(sock) (sock->sk_security)
+#define get_file_security(file) (file->f_security)
+#define get_ipc_security(ipc) (ipc->security)
+
+#define jail_of(proc) (get_task_security(proc))
+
+/*
+ * disable_jail: A jail which was in use, but has no references
+ * left, is disabled - we free up the mountpoint and dentry, and
+ * give up our reference on the module.
+ *
+ * don't need to put namespace, it will be done automatically
+ * when the last process in jail is put.
+ * DO need to put the dentry and vfsmount
+ */
+static void
+disable_jail(struct jail_struct *tsec)
+{
+ dput(tsec->dentry);
+ mntput(tsec->mnt);
+ module_put(THIS_MODULE);
+}
+
+
+static void free_jail(struct jail_struct *tsec)
+{
+ if (!tsec)
+ return;
+
+ if (tsec->root_pathname)
+ kfree(tsec->root_pathname);
+ if (tsec->ip4_addr_name)
+ kfree(tsec->ip4_addr_name);
+ if (tsec->ip6_addr_name)
+ kfree(tsec->ip6_addr_name);
+ kfree(tsec);
+}
+
+#define set_task_security(task,data) task->security = data
+#define set_inode_security(inode,data) inode->i_security = data
+#define set_sock_security(sock,data) sock->sk_security = data
+#define set_file_security(file,data) file->f_security = data
+#define set_ipc_security(ipc,data) ipc.security = data
+
+/*
+ * jail_task_free_security: this is the callback hooked into LSM.
+ * If there was no task->security field for bsdjail, do nothing.
+ * If there was, but it was never put into use, free the jail.
+ * If there was, and the jail is in use, then decrement the usage
+ * count, and disable and free the jail if the usage count hits 0.
+ */
+static void jail_task_free_security(struct task_struct *task)
+{
+ struct jail_struct *tsec;
+
+ tsec = get_task_security(task);
+
+ if (!tsec)
+ return;
+
+ if (!in_use(tsec)) {
+ /*
+ * someone did 'echo -n x > /proc/<pid>/attr/exec' but
+ * then forked before execing. Nuke the old info.
+ */
+ free_jail(tsec);
+ set_task_security(task,NULL);
+ return;
+ }
+ tsec->cur_nrtask--;
+ /* If this was the last process in the jail, delete the jail */
+ kref_put(&tsec->kref);
+}
+
+static struct jail_struct *
+alloc_task_security(struct task_struct *tsk)
+{
+ struct jail_struct *tsec;
+ tsec = kmalloc(sizeof(struct jail_struct), GFP_KERNEL);
+ if (!tsec)
+ return ERR_PTR(-ENOMEM);
+ memset(tsec, 0, sizeof(struct jail_struct));
+ set_task_security(tsk, tsec);
+ return tsec;
+}
+
+static inline int
+in_jail(struct task_struct *t)
+{
+ struct jail_struct *tsec = jail_of(t);
+
+ if (tsec && in_use(tsec))
+ return 1;
+
+ return 0;
+}
+
+/*
+ * If a network address was passed into /proc/<pid>/attr/exec,
+ * then process in its jail will only be allowed to bind/listen
+ * to that address.
+ */
+static void
+setup_netaddress(struct jail_struct *tsec)
+{
+ unsigned int a,b,c,d, i;
+ unsigned int x[8];
+
+ unset_got_ipv4(tsec);
+ tsec->addr4 = 0;
+ unset_got_ipv6(tsec);
+ ipv6_addr_set(&tsec->addr6, 0, 0, 0, 0);
+
+ if (tsec->ip4_addr_name) {
+ if (sscanf(tsec->ip4_addr_name,"%u.%u.%u.%u",&a,&b,&c,&d)!=4)
+ return;
+ if (a>255 || b>255 || c>255 || d>255)
+ return;
+ tsec->addr4 = htonl((a<<24)|(b<<16)|(c<<8)|d);
+ set_ipv4(tsec);
+ bsdj_debug(DBG, "Network (ipv4) set up (%s)\n",
+ tsec->ip4_addr_name);
+ }
+
+ if (tsec->ip6_addr_name) {
+ if (sscanf(tsec->ip6_addr_name,"%x:%x:%x:%x:%x:%x:%x:%x",
+ &x[0], &x[1], &x[2], &x[3], &x[4], &x[5], &x[6],
+ &x[7]) != 8) {
+ printk(KERN_INFO "%s: bad ipv6 addr %s\n", __FUNCTION__,
+ tsec->ip6_addr_name);
+ return;
+ }
+ for (i=0; i<8; i++) {
+ if (x[i] > 65535) {
+ printk("%s: %x > 65535 at %d\n", __FUNCTION__, x[i], i);
+ return;
+ }
+ tsec->addr6.in6_u.u6_addr16[i] = htons(x[i]);
+ }
+ set_ipv6(tsec);
+ bsdj_debug(DBG, "Network (ipv6) set up (%s)\n",
+ tsec->ip6_addr_name);
+ }
+}
+
+/* release_jail:
+ * Callback for kref_put to use for releasing a jail when its
+ * last user exits.
+ */
+static void release_jail(struct kref *kref)
+{
+ struct jail_struct *tsec;
+
+ tsec = container_of(kref,struct jail_struct,kref);
+ disable_jail(tsec);
+ free_jail(tsec);
+}
+
+/*
+ * enable_jail:
+ * Called when a process is placed into a new jail to handle the
+ * actual creation of the jail.
+ * Creates namespace
+ * Sets process root+pwd
+ * Stores the requested ip address
+ * Registers a unique pseudo-proc filesystem for this jail
+ */
+static int enable_jail(struct task_struct *tsk)
+{
+ struct nameidata nd;
+ struct jail_struct *tsec;
+ int retval = -EFAULT;
+
+ tsec = jail_of(tsk);
+ if (!tsec || !tsec->root_pathname)
+ goto out;
+
+ /*
+ * USE_JAIL_NAMESPACE: could be useful, so that future mounts outside
+ * the jail don't affect the jail. But it's not necessary, and
+ * requires exporting copy_namespace from fs/namespace.c
+ *
+ * Actually, it would also be useful for truly hiding
+ * information about mounts which do not exist in this jail.
+#define USE_JAIL_NAMESPACE
+ */
+#ifdef USE_JAIL_NAMESPACE
+ bsdj_debug(DBG, "bsdjail: copying namespace.\n");
+ retval = -EPERM;
+ if (copy_namespace(CLONE_NEWNS, tsk))
+ goto out;
+ bsdj_debug(DBG, "bsdjail: copied namespace.\n");
+#endif
+
+ /* find our new root directory */
+ bsdj_debug(DBG, "bsdjail: looking up %s\n", tsec->root_pathname);
+ retval = path_lookup(tsec->root_pathname, LOOKUP_FOLLOW | LOOKUP_DIRECTORY, &nd);
+ if (retval)
+ goto out;
+
+ bsdj_debug(DBG, "bsdjail: got %s, setting root to it\n", tsec->root_pathname);
+
+ /* and set the fsroot to it */
+ set_fs_root(tsk->fs, nd.mnt, nd.dentry);
+ set_fs_pwd(tsk->fs, nd.mnt, nd.dentry);
+
+ bsdj_debug(DBG, "bsdjail: root has been set. Have fun.\n");
+
+ /* set up networking */
+ if (tsec->ip4_addr_name || tsec->ip6_addr_name)
+ setup_netaddress(tsec);
+
+ tsec->cur_nrtask = 1;
+ if (tsec->nice)
+ set_user_nice(current, tsec->nice);
+ if (tsec->max_data) {
+ current->rlim[RLIMIT_DATA].rlim_cur = tsec->max_data;
+ current->rlim[RLIMIT_DATA].rlim_max = tsec->max_data;
+ }
+ if (tsec->max_memlock) {
+ current->rlim[RLIMIT_MEMLOCK].rlim_cur = tsec->max_memlock;
+ current->rlim[RLIMIT_MEMLOCK].rlim_max = tsec->max_memlock;
+ }
+ if (tsec->maxtimeslice) {
+ current->rlim[RLIMIT_CPU].rlim_cur = tsec->maxtimeslice;
+ current->rlim[RLIMIT_CPU].rlim_max = tsec->maxtimeslice;
+ }
+ /* success and end */
+ tsec->mnt = mntget(nd.mnt);
+ tsec->dentry = dget(nd.dentry);
+ path_release(&nd);
+ kref_init(&tsec->kref, release_jail);
+ set_in_use(tsec);
+
+ /* won't let ourselves be removed until this jail goes away */
+ try_module_get(THIS_MODULE);
+
+ return 0;
+
+out:
+ return retval;
+}
+
+/*
+ * LSM /proc/<pid>/attr hooks.
+ * You may write into /proc/<pid>/attr/exec:
+ * root /some/path
+ * ip 2.2.2.2
+ * These values will be used on the next exec() to set up your jail
+ * (assuming you're not already in a jail)
+ */
+static int
+jail_setprocattr(struct task_struct *p, char *name, void *value, size_t size)
+{
+ struct jail_struct *tsec = jail_of(current);
+ long val;
+ int start, len;
+
+ if (tsec && in_use(tsec))
+ return -EINVAL; /* let them guess why */
+
+ if (p != current || strcmp(name, "exec"))
+ return -EPERM;
+
+ if (strncmp(value, "root ", 5)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ if (tsec->root_pathname)
+ kfree(tsec->root_pathname);
+ start = 5;
+ len = size-start;
+ tsec->root_pathname = kmalloc(len+1, GFP_KERNEL);
+ if (!tsec->root_pathname)
+ return -ENOMEM;
+ strncpy(tsec->root_pathname, value+start, len);
+ tsec->root_pathname[len] = '\0';
+ } else if (strncmp(value, "ip ", 3)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ if (tsec->ip4_addr_name)
+ kfree(tsec->ip4_addr_name);
+ start = 3;
+ len = size-start;
+ tsec->ip4_addr_name = kmalloc(len+1, GFP_KERNEL);
+ if (!tsec->ip4_addr_name)
+ return -ENOMEM;
+ strncpy(tsec->ip4_addr_name, value+start, len);
+ tsec->ip4_addr_name[len] = '\0';
+ } else if (strncmp(value, "ip6 ", 4) == 0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ if (tsec->ip6_addr_name)
+ kfree(tsec->ip6_addr_name);
+ start = 4;
+ len = size-start;
+ tsec->ip6_addr_name = kmalloc(len+1, GFP_KERNEL);
+ if (!tsec->ip6_addr_name)
+ return -ENOMEM;
+ strncpy(tsec->ip6_addr_name, value+start, len);
+ tsec->ip6_addr_name[len] = '\0';
+
+ /* the next two are equivalent */
+ } else if (strncmp(value, "slice ", 6)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ val = simple_strtoul(value+6, NULL, 0);
+ tsec->maxtimeslice = val;
+ } else if (strncmp(value, "timeslice ", 10)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ val = simple_strtoul(value+10, NULL, 0);
+ tsec->maxtimeslice = val;
+ } else if (strncmp(value, "nrtask ", 7)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ val = (int) simple_strtol(value+7, NULL, 0);
+ if (val < 1)
+ return -EINVAL;
+ tsec->max_nrtask = val;
+ } else if (strncmp(value, "memlock ", 8)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ val = simple_strtoul(value+8, NULL, 0);
+ tsec->max_memlock = val;
+ } else if (strncmp(value, "data ", 5)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ val = simple_strtoul(value+5, NULL, 0);
+ tsec->max_data = val;
+ } else if (strncmp(value, "nice ", 5)==0) {
+ if (!tsec)
+ tsec = alloc_task_security(current);
+ if (IS_ERR(tsec))
+ return -ENOMEM;
+
+ val = simple_strtoul(value+5, NULL, 0);
+ tsec->nice = val;
+ } else
+ return -EINVAL;
+
+ return size;
+}
+
+static int print_jail_net_info(struct jail_struct *j, char *buf, int maxcnt)
+{
+ int len = 0;
+
+ if (j->ip4_addr_name)
+ len += snprintf(buf, maxcnt, "%s\n", j->ip4_addr_name);
+ if (j->ip6_addr_name)
+ len += snprintf(buf, maxcnt-len, "%s\n", j->ip6_addr_name);
+
+ return snprintf(buf, maxcnt, "No network information\n");
+}
+
+/*
+ * LSM /proc/<pid>/attr read hook.
+ *
+ * /proc/$$/attr/current output:
+ * If the reading process, say process 1001, is in a jail, then
+ * cat /proc/999/attr/current
+ * will print networking information.
+ * If the reading process, say process 1001, is not in a jail, then
+ * cat /proc/999/attr/current
+ * will return
+ * root: (root of jail)
+ * ip: (ip address of jail)
+ * if 999 is in a jail, or
+ * -EINVAL
+ * if 999 is not in a jail.
+ *
+ * /proc/$$/attr/exec output:
+ * A process in a jail gets -EINVAL for /proc/$$/attr/exec.
+ * A process not in a jail gets hints on starting a jail.
+ */
+static int
+jail_getprocattr(struct task_struct *p, char *name, void *value, size_t size)
+{
+ struct jail_struct *tsec;
+ int err = 0;
+
+ if (in_jail(current)) {
+ if (strcmp(name, "current")==0) {
+ /* provide network info */
+ err = print_jail_net_info(jail_of(current), value,
+ size);
+ return err;
+ }
+ return -EINVAL; /* let them guess why */
+ }
+
+ if (strcmp(name, "exec") == 0) {
+ /* Print usage some help */
+ err = snprintf(value, size,
+ "Valid keywords:\n"
+ "root <pathname>\n"
+ "ip <ip4-addr>\n"
+ "ip6 <ip6-addr>\n"
+ "nrtask <max number of tasks in this jail>\n"
+ "nice <nice level for processes in this jail>\n"
+ "slice <max timeslice per process in msecs>\n"
+ "data <max data size per process in bytes>\n"
+ "memlock <max lockable memory per process in bytes>\n");
+ return err;
+ }
+
+ if (strcmp(name, "current"))
+ return -EPERM;
+
+ tsec = jail_of(p);
+ if (!tsec || !in_use(tsec)) {
+ err = snprintf(value, size, "Not Jailed\n");
+ } else {
+ err = snprintf(value, size,
+ "Root: %s\nIPv4: %s\nIPv6: %s\n"
+ "max_nrtask %d current nrtask %d max_timeslice %lu "
+ "nice %lu\n"
+ "max_memlock %lu max_data %lu\n",
+ tsec->root_pathname,
+ tsec->ip4_addr_name ? tsec->ip4_addr_name : "(none)",
+ tsec->ip6_addr_name ? tsec->ip6_addr_name : "(none)",
+ tsec->max_nrtask, tsec->cur_nrtask, tsec->maxtimeslice,
+ tsec->nice, tsec->max_data, tsec->max_memlock);
+ }
+
+ return err;
+}
+
+/*
+ * Forbid a process in a jail from sending a signal to a process in another
+ * (or no) jail through file sigio.
+ *
+ * We consider the process which set the fowner to be the one sending the
+ * signal, rather than the one writing to the file. Therefore we store the
+ * jail of a process during jail_file_set_fowner, then check that against
+ * the jail of the process receiving the signal.
+ */
+static int
+jail_file_send_sigiotask(struct task_struct *tsk, struct fown_struct *fown,
+ int fd, int reason)
+{
+ struct file *file;
+ struct jail_struct *tsec, *fsec;
+
+ if (!in_jail(current))
+ return 0;
+
+ file = (struct file *)((long)fown - offsetof(struct file,f_owner));
+ tsec = jail_of(tsk);
+ fsec = get_file_security(file);
+
+ if (fsec != tsec)
+ return -EPERM;
+
+ return 0;
+}
+
+static int
+jail_file_set_fowner(struct file *file)
+{
+ struct jail_struct *tsec;
+
+ tsec = jail_of(current);
+ set_file_security(file, tsec);
+ if (tsec)
+ kref_get(&tsec->kref);
+
+ return 0;
+}
+
+static void free_ipc_security(struct kern_ipc_perm *ipc)
+{
+ struct jail_struct *tsec;
+
+ tsec = get_ipc_security(ipc);
+ if (!tsec)
+ return;
+ kref_put(&tsec->kref);
+ set_ipc_security((*ipc), NULL);
+}
+
+static void free_file_security(struct file *file)
+{
+ struct jail_struct *tsec;
+
+ tsec = get_file_security(file);
+ if (!tsec)
+ return;
+ kref_put(&tsec->kref);
+ set_file_security(file, NULL);
+}
+
+static void free_inode_security(struct inode *inode)
+{
+ struct jail_struct *tsec;
+
+ tsec = get_inode_security(inode);
+ if (!tsec)
+ return;
+ kref_put(&tsec->kref);
+ set_inode_security(inode, NULL);
+}
+
+/*
+ * LSM ptrace hook:
+ * process in jail may not ptrace process not in the same jail
+ */
+static int
+jail_ptrace (struct task_struct *tracer, struct task_struct *tracee)
+{
+ struct jail_struct *tsec = jail_of(tracer);
+
+ if (tsec && in_use(tsec)) {
+ if (tsec == jail_of(tracee))
+ return 0;
+ return -EPERM;
+ }
+ return 0;
+}
+
+/*
+ * process in jail may only use one (aliased) ip address. If they try to
+ * attach to 127.0.0.1, that is remapped to their own address. If some
+ * other address (and not their own), deny permission
+ */
+static int jail_socket_unix_bind(struct socket *sock, struct sockaddr *address,
+ int addrlen);
+
+#define loopbackaddr htonl((127 << 24) | 1)
+
+static inline int jail_inet4_bind(struct socket *sock, struct sockaddr *address,
+ int addrlen, struct jail_struct *tsec)
+{
+ struct sockaddr_in *inaddr;
+ __u32 sin_addr, jailaddr;
+
+ if (!got_ipv4(tsec))
+ return -EPERM;
+
+ inaddr = (struct sockaddr_in *)address;
+ sin_addr = inaddr->sin_addr.s_addr;
+ jailaddr = tsec->addr4;
+
+ if (sin_addr == jailaddr)
+ return 0;
+
+ if (sin_addr == loopbackaddr || !sin_addr) {
+ bsdj_debug(DBG, "Got a loopback or 0 address\n");
+ sin_addr = jailaddr;
+ bsdj_debug(DBG, "Converted to: %u.%u.%u.%u\n",
+ NIPQUAD(sin_addr));
+ return 0;
+ }
+
+ return -EPERM;
+}
+
+static inline int
+jail_inet6_bind(struct socket *sock, struct sockaddr *address, int addrlen,
+ struct jail_struct *tsec)
+{
+ struct sockaddr_in6 *inaddr6;
+ struct in6_addr *sin6_addr, *jailaddr;
+
+ if (!got_ipv6(tsec))
+ return -EPERM;
+
+ inaddr6 = (struct sockaddr_in6 *)address;
+ sin6_addr = &inaddr6->sin6_addr;
+ jailaddr = &tsec->addr6;
+
+ if (ipv6_addr_cmp(jailaddr, sin6_addr)==0)
+ return 0;
+
+ if (ipv6_addr_cmp(sin6_addr, &in6addr_loopback)==0) {
+ ipv6_addr_copy(sin6_addr, jailaddr);
+ return 0;
+ }
+
+ printk(KERN_NOTICE "%s: DENYING\n", __FUNCTION__);
+ printk(KERN_NOTICE "%s: a %04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x "
+ "j %04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x\n",
+ __FUNCTION__,
+ NIP6(*sin6_addr),
+ NIP6(*jailaddr));
+
+ return -EPERM;
+}
+
+static int
+jail_socket_bind(struct socket *sock, struct sockaddr *address, int addrlen)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ if (sock->sk->sk_family == AF_UNIX)
+ return jail_socket_unix_bind(sock, address, addrlen);
+
+ if (!got_network(tsec))
+ /* If we want to be strict, we could just
+ * deny net access when lacking a pseudo ip.
+ * For now we just allow it. */
+ return 0;
+
+ switch(address->sa_family) {
+ case AF_INET:
+ return jail_inet4_bind(sock, address, addrlen, tsec);
+
+ case AF_INET6:
+ return jail_inet6_bind(sock, address, addrlen, tsec);
+
+ default:
+ return 0;
+ }
+}
+
+/*
+ * If locked in an ipv6 jail, don't let them use ipv4, and vice versa
+ */
+static int
+jail_socket_create(int family, int type, int protocol, int kern)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec) || kern || !got_network(tsec))
+ return 0;
+
+ switch(family) {
+ case AF_INET:
+ if (got_ipv4(tsec))
+ return 0;
+ return -EPERM;
+ case AF_INET6:
+ if (got_ipv6(tsec))
+ return 0;
+ return -EPERM;
+ default:
+ return 0;
+ };
+
+ return 0;
+}
+
+static void
+jail_socket_post_create(struct socket *sock, int family, int type,
+ int protocol, int kern)
+{
+ struct inet_opt *inet;
+ struct ipv6_pinfo *inet6;
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec) || kern || !got_network(tsec))
+ return;
+
+ switch(family) {
+ case AF_INET:
+ inet = inet_sk(sock->sk);
+ inet->saddr = tsec->addr4;
+ break;
+ case AF_INET6:
+ inet6 = inet6_sk(sock->sk);
+ ipv6_addr_copy(&inet6->saddr, &tsec->addr6);
+ break;
+ default:
+ break;
+ };
+
+ return;
+}
+
+static int
+jail_socket_listen(struct socket *sock, int backlog)
+{
+ struct inet_opt *inet;
+ struct ipv6_pinfo *inet6;
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec) || !got_network(tsec))
+ return 0;
+
+ switch (sock->sk->sk_family) {
+ case AF_INET:
+ inet = inet_sk(sock->sk);
+ if (inet->saddr == tsec->addr4)
+ return 0;
+ return -EPERM;
+
+ case AF_INET6:
+ inet6 = inet6_sk(sock->sk);
+ if (ipv6_addr_cmp(&inet6->saddr, &tsec->addr6)==0)
+ return 0;
+ return -EPERM;
+
+ default:
+ return 0;
+
+ }
+}
+
+static void free_sock_security(struct sock *sk)
+{
+ struct jail_struct *tsec;
+
+ tsec = get_sock_security(sk);
+ if (!tsec)
+ return;
+ kref_put(&tsec->kref);
+ set_sock_security(sk, NULL);
+}
+
+/*
+ * The next three (socket) hooks prevent a process in a jail from sending
+ * data to a abstract unix domain socket which was bound outside the jail.
+ */
+static int
+jail_socket_unix_bind(struct socket *sock, struct sockaddr *address,
+ int addrlen)
+{
+ struct sockaddr_un *sunaddr;
+ struct jail_struct *tsec;
+
+ if (sock->sk->sk_family != AF_UNIX)
+ return 0;
+
+ sunaddr = (struct sockaddr_un *)address;
+ if (sunaddr->sun_path[0] != 0)
+ return 0;
+
+ tsec = jail_of(current);
+ set_sock_security(sock->sk, tsec);
+ if (tsec)
+ kref_get(&tsec->kref);
+ return 0;
+}
+
+/*
+ * Note - we deny sends both from unjailed to jailed, and from jailed
+ * to unjailed. As well as, of course between different jails.
+ */
+static int
+jail_socket_unix_may_send(struct socket *sock, struct socket *other)
+{
+ struct jail_struct *tsec, *ssec;
+
+ tsec = jail_of(current); /* jail of sending process */
+ ssec = get_sock_security(other->sk); /* jail of receiver */
+
+ if (tsec != ssec)
+ return -EPERM;
+
+ return 0;
+}
+
+static int
+jail_socket_unix_stream_connect(struct socket *sock,
+ struct socket *other, struct sock *newsk)
+{
+ struct jail_struct *tsec, *ssec;
+
+ tsec = jail_of(current); /* jail of sending process */
+ ssec = get_sock_security(other->sk); /* jail of receiver */
+
+ if (tsec != ssec)
+ return -EPERM;
+
+ return 0;
+}
+
+static int
+jail_mount(char * dev_name, struct nameidata *nd, char * type,
+ unsigned long flags, void * data)
+{
+ if (in_jail(current))
+ return -EPERM;
+
+ return 0;
+}
+
+static int
+jail_umount(struct vfsmount *mnt, int flags)
+{
+ if (in_jail(current))
+ return -EPERM;
+
+ return 0;
+}
+
+/*
+ * process in jail may not:
+ * use nice
+ * change network config
+ * load/unload modules
+ */
+static int
+jail_capable (struct task_struct *tsk, int cap)
+{
+ if (in_jail(tsk)) {
+ if (cap == CAP_SYS_NICE)
+ return -EPERM;
+ if (cap == CAP_NET_ADMIN)
+ return -EPERM;
+ if (cap == CAP_SYS_MODULE)
+ return -EPERM;
+ if (cap == CAP_SYS_RAWIO)
+ return -EPERM;
+ }
+
+ if (cap_is_fs_cap (cap) ? tsk->fsuid == 0 : tsk->euid == 0)
+ return 0;
+ return -EPERM;
+}
+
+/*
+ * jail_security_task_create:
+ *
+ * If the current process is ina a jail, and that jail is about to exceed a
+ * maximum number of processes, then refuse to fork. If the maximum number
+ * of jails is listed as 0, then there is no limit for this jail, and we allow
+ * all forks.
+ */
+static inline int
+jail_security_task_create (unsigned long clone_flags)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ if (tsec->max_nrtask && tsec->cur_nrtask >= tsec->max_nrtask)
+ return -EPERM;
+ return 0;
+}
+
+/*
+ * The child of a process in a jail belongs in the same jail
+ */
+static int
+jail_task_alloc_security(struct task_struct *tsk)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ set_task_security(tsk, tsec);
+ kref_get(&tsec->kref);
+ tsec->cur_nrtask++;
+ if (tsec->maxtimeslice) {
+ tsk->rlim[RLIMIT_CPU].rlim_max = tsec->maxtimeslice;
+ tsk->rlim[RLIMIT_CPU].rlim_cur = tsec->maxtimeslice;
+ }
+ if (tsec->max_data) {
+ tsk->rlim[RLIMIT_CPU].rlim_max = tsec->max_data;
+ tsk->rlim[RLIMIT_CPU].rlim_cur = tsec->max_data;
+ }
+ if (tsec->max_memlock) {
+ tsk->rlim[RLIMIT_CPU].rlim_max = tsec->max_memlock;
+ tsk->rlim[RLIMIT_CPU].rlim_cur = tsec->max_memlock;
+ }
+ if (tsec->nice)
+ set_user_nice(current, tsec->nice);
+
+ return 0;
+}
+
+static int
+jail_bprm_alloc_security(struct linux_binprm *bprm)
+{
+ struct jail_struct *tsec;
+ int ret;
+
+ tsec = jail_of(current);
+ if (!tsec)
+ return 0;
+
+ if (in_use(tsec))
+ return 0;
+
+ if (tsec->root_pathname) {
+ ret = enable_jail(current);
+ if (ret) {
+ /* if we failed, nix out the root/ip requests */
+ jail_task_free_security(current);
+ return ret;
+ }
+ }
+ return 0;
+}
+
+/*
+ * Process in jail may not create devices
+ * Thanks to Brad Spender for pointing out fifos should be allowed.
+ */
+/* TODO: We may want to allow /dev/log, at least... */
+static int
+jail_inode_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
+{
+ if (!in_jail(current))
+ return 0;
+
+ if (S_ISFIFO(mode))
+ return 0;
+
+ return -EPERM;
+}
+
+/* yanked from fs/proc/base.c */
+static unsigned name_to_int(struct dentry *dentry)
+{
+ const char *name = dentry->d_name.name;
+ int len = dentry->d_name.len;
+ unsigned n = 0;
+
+ if (len > 1 && *name == '0')
+ goto out;
+ while (len-- > 0) {
+ unsigned c = *name++ - '0';
+ if (c > 9)
+ goto out;
+ if (n >= (~0U-9)/10)
+ goto out;
+ n *= 10;
+ n += c;
+ }
+ return n;
+out:
+ return ~0U;
+}
+
+/*
+ * jail_proc_inode_permission:
+ * called only when current is in a jail, and is trying to reach
+ * /proc/<pid>. We check whether <pid> is in the same jail as
+ * current. If not, permission is denied.
+ *
+ * NOTE: On the one hand, the task_to_inode(inode)->i_security
+ * approach seems cleaner, but on the other, this prevents us
+ * from unloading bsdjail for awhile...
+ */
+static int
+jail_proc_inode_permission(struct inode *inode, int mask,
+ struct nameidata *nd)
+{
+ struct jail_struct *tsec = jail_of(current);
+ struct dentry *dentry = nd->dentry;
+ unsigned pid;
+
+ pid = name_to_int(dentry);
+ if (pid == ~0U) {
+ struct qstr *dname = &dentry->d_name;
+ if (strcmp(dname->name, "scsi")==0 ||
+ strcmp(dname->name, "sys")==0 ||
+ strcmp(dname->name, "ide")==0)
+ return -EPERM;
+ return 0;
+ }
+
+ if (dentry->d_parent != dentry->d_sb->s_root)
+ return 0;
+ if (get_inode_security(inode) != tsec)
+ return -ENOENT;
+
+ return 0;
+}
+
+/*
+ * Here is our attempt to prevent chroot escapes.
+ */
+static int
+is_jailroot_parent(struct dentry *candidate, struct dentry *root,
+ struct vfsmount *rootmnt)
+{
+ if (candidate == root)
+ return 0;
+
+ /* simple case: fs->root/.. == candidate */
+ if (root->d_parent == candidate)
+ return 1;
+
+ /*
+ * now more complicated: if fs->root is a mounted directory,
+ * then chdir(..) out of fs->root, at follow_dotdot, will follow
+ * the fs->root mount point. So we must check the parent dir of
+ * the fs->root mount point.
+ */
+ if (rootmnt->mnt_root == root && rootmnt->mnt_mountpoint!=root) {
+ root = rootmnt->mnt_mountpoint;
+ rootmnt = rootmnt->mnt_parent;
+ return is_jailroot_parent(candidate, root, rootmnt);
+ }
+
+ return 0;
+}
+
+/*
+ * A process in a jail may not see that /proc/<pid> exists for
+ * process not in its jail
+ * Unfortunately we can't pretend that pid for the starting process
+ * is 1, as vserver does.
+ */
+static int jail_task_lookup(struct task_struct *p)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec)
+ return 0;
+ if (tsec == jail_of(p))
+ return 0;
+ return -EPERM;
+}
+/*
+ * security_task_to_inode:
+ * Set inode->security = task's jail.
+ */
+static void jail_task_to_inode(struct task_struct *p, struct inode *inode)
+{
+ struct jail_struct *tsec = jail_of(p);
+
+ if (!tsec || !in_use(tsec))
+ return;
+ if (get_inode_security(inode))
+ return;
+ kref_get(&tsec->kref);
+ set_inode_security(inode, tsec);
+}
+
+/*
+ * inode_permission:
+ * If we are trying to look into certain /proc files from in a jail, we
+ * may deny permission.
+ * If we are trying to cd(..), but the cwd is the root of our jail, then
+ * permission is denied.
+ */
+static int
+jail_inode_permission(struct inode *inode, int mask,
+ struct nameidata *nd)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ if (!nd)
+ return 0;
+
+ if (nd->dentry &&
+ strcmp(nd->dentry->d_sb->s_type->name, "proc")==0) {
+ return jail_proc_inode_permission(inode, mask, nd);
+
+ }
+
+ if (!(mask&MAY_EXEC))
+ return 0;
+ if (!inode || !S_ISDIR(inode->i_mode))
+ return 0;
+
+ if (is_jailroot_parent(nd->dentry, tsec->dentry, tsec->mnt)) {
+ bsdj_debug(WARN,"Attempt to chdir(..) out of jail!\n"
+ "(%s is a subdir of %s)\n",
+ tsec->dentry->d_name.name,
+ nd->dentry->d_name.name);
+ return -EPERM;
+ }
+
+ return 0;
+}
+
+/*
+ * A function which returns -ENOENT if dentry is the dentry for
+ * a /proc/<pid> directory. It returns 0 otherwise.
+ */
+static inline int
+generic_procpid_check(struct dentry *dentry)
+{
+ struct jail_struct *jail = jail_of(current);
+ unsigned pid = name_to_int(dentry);
+
+ if (!jail || !in_use(jail))
+ return 0;
+ if (pid == ~0U)
+ return 0;
+ if (strcmp(dentry->d_sb->s_type->name, "proc")!=0)
+ return 0;
+ if (dentry->d_parent != dentry->d_sb->s_root)
+ return 0;
+ if (get_inode_security(dentry->d_inode) != jail)
+ return -ENOENT;
+ return 0;
+}
+
+/*
+ * We want getattr to fail on /proc/<pid> to prevent leakage through, for
+ * instance, ls -d.
+ */
+static int
+jail_inode_getattr(struct vfsmount *mnt, struct dentry *dentry)
+{
+ return generic_procpid_check(dentry);
+}
+
+/* This probably is not necessary - /proc does not support xattrs? */
+static int
+jail_inode_getxattr(struct dentry *dentry, char *name)
+{
+ return generic_procpid_check(dentry);
+}
+
+/* process in jail may not send signal to process not in the same jail */
+static int
+jail_task_kill(struct task_struct *p, struct siginfo *info, int sig)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ if (tsec == jail_of(p))
+ return 0;
+
+ if (sig==SIGCHLD)
+ return 0;
+
+ return -EPERM;
+}
+
+/*
+ * LSM hooks to limit jailed process' abilities to muck with resource
+ * limits
+ */
+static int jail_task_setrlimit (unsigned int resource, struct rlimit *new_rlim)
+{
+ if (!in_jail(current))
+ return 0;
+
+ return -EPERM;
+}
+
+static int jail_task_setscheduler (struct task_struct *p, int policy,
+ struct sched_param *lp)
+{
+ if (!in_jail(current))
+ return 0;
+
+ return -EPERM;
+}
+
+/*
+ * LSM hooks to limit IPC access.
+ */
+
+static inline int
+basic_ipc_security_check(struct kern_ipc_perm *p, struct task_struct *target)
+{
+ struct jail_struct *tsec = jail_of(target);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+
+ if (get_ipc_security(p) != tsec)
+ return -EPERM;
+
+ return 0;
+}
+
+static int
+jail_ipc_permission(struct kern_ipc_perm *ipcp, short flag)
+{
+ return basic_ipc_security_check(ipcp, current);
+}
+
+static int
+jail_shm_alloc_security (struct shmid_kernel *shp)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+ set_ipc_security(shp->shm_perm, tsec);
+ kref_get(&tsec->kref);
+ return 0;
+}
+
+static void
+jail_shm_free_security (struct shmid_kernel *shp)
+{
+ free_ipc_security(&shp->shm_perm);
+}
+
+static int
+jail_shm_associate (struct shmid_kernel *shp, int shmflg)
+{
+ return basic_ipc_security_check(&shp->shm_perm, current);
+}
+
+static int
+jail_shm_shmctl(struct shmid_kernel *shp, int cmd)
+{
+ if (cmd == IPC_INFO || cmd == SHM_INFO)
+ return 0;
+
+ return basic_ipc_security_check(&shp->shm_perm, current);
+}
+
+static int
+jail_shm_shmat(struct shmid_kernel *shp, char *shmaddr, int shmflg)
+{
+ return basic_ipc_security_check(&shp->shm_perm, current);
+}
+
+static int
+jail_msg_queue_alloc(struct msg_queue *msq)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+ set_ipc_security(msq->q_perm, tsec);
+ kref_get(&tsec->kref);
+ return 0;
+}
+
+static void
+jail_msg_queue_free(struct msg_queue *msq)
+{
+ free_ipc_security(&msq->q_perm);
+}
+
+static int jail_msg_queue_associate(struct msg_queue *msq, int flag)
+{
+ return basic_ipc_security_check(&msq->q_perm, current);
+}
+
+static int
+jail_msg_queue_msgctl(struct msg_queue *msq, int cmd)
+{
+ if (cmd == IPC_INFO || cmd == MSG_INFO)
+ return 0;
+
+ return basic_ipc_security_check(&msq->q_perm, current);
+}
+
+static int
+jail_msg_queue_msgsnd(struct msg_queue *msq, struct msg_msg *msg, int msqflg)
+{
+ return basic_ipc_security_check(&msq->q_perm, current);
+}
+
+static int
+jail_msg_queue_msgrcv(struct msg_queue *msq, struct msg_msg *msg,
+ struct task_struct *target, long type, int mode)
+
+{
+ return basic_ipc_security_check(&msq->q_perm, target);
+}
+
+static int
+jail_sem_alloc_security(struct sem_array *sma)
+{
+ struct jail_struct *tsec = jail_of(current);
+
+ if (!tsec || !in_use(tsec))
+ return 0;
+ set_ipc_security(sma->sem_perm, tsec);
+ kref_get(&tsec->kref);
+ return 0;
+}
+
+static void
+jail_sem_free_security(struct sem_array *sma)
+{
+ free_ipc_security(&sma->sem_perm);
+}
+
+static int
+jail_sem_associate(struct sem_array *sma, int semflg)
+{
+ return basic_ipc_security_check(&sma->sem_perm, current);
+}
+
+static int
+jail_sem_semctl(struct sem_array *sma, int cmd)
+{
+ if (cmd == IPC_INFO || cmd == SEM_INFO)
+ return 0;
+ return basic_ipc_security_check(&sma->sem_perm, current);
+}
+
+static int
+jail_sem_semop(struct sem_array *sma, struct sembuf *sops, unsigned nsops,
+ int alter)
+{
+ return basic_ipc_security_check(&sma->sem_perm, current);
+}
+
+static struct security_operations bsdjail_security_ops = {
+ .ptrace = jail_ptrace,
+ .capable = jail_capable,
+
+ .task_kill = jail_task_kill,
+ .task_alloc_security = jail_task_alloc_security,
+ .task_free_security = jail_task_free_security,
+ .bprm_alloc_security = jail_bprm_alloc_security,
+ .task_create = jail_security_task_create,
+ .task_to_inode = jail_task_to_inode,
+ .task_lookup = jail_task_lookup,
+
+ .task_setrlimit = jail_task_setrlimit,
+ .task_setscheduler = jail_task_setscheduler,
+
+ .setprocattr = jail_setprocattr,
+ .getprocattr = jail_getprocattr,
+
+ .file_set_fowner = jail_file_set_fowner,
+ .file_send_sigiotask = jail_file_send_sigiotask,
+ .file_free_security = free_file_security,
+
+ .socket_bind = jail_socket_bind,
+ .socket_listen = jail_socket_listen,
+ .socket_create = jail_socket_create,
+ .socket_post_create = jail_socket_post_create,
+ .unix_stream_connect = jail_socket_unix_stream_connect,
+ .unix_may_send = jail_socket_unix_may_send,
+ .sk_free_security = free_sock_security,
+
+ .inode_mknod = jail_inode_mknod,
+ .inode_permission = jail_inode_permission,
+ .inode_free_security = free_inode_security,
+ .inode_getattr = jail_inode_getattr,
+ .inode_getxattr = jail_inode_getxattr,
+ .sb_mount = jail_mount,
+ .sb_umount = jail_umount,
+
+ .ipc_permission = jail_ipc_permission,
+ .shm_alloc_security = jail_shm_alloc_security,
+ .shm_free_security = jail_shm_free_security,
+ .shm_associate = jail_shm_associate,
+ .shm_shmctl = jail_shm_shmctl,
+ .shm_shmat = jail_shm_shmat,
+
+ .msg_queue_alloc_security = jail_msg_queue_alloc,
+ .msg_queue_free_security = jail_msg_queue_free,
+ .msg_queue_associate = jail_msg_queue_associate,
+ .msg_queue_msgctl = jail_msg_queue_msgctl,
+ .msg_queue_msgsnd = jail_msg_queue_msgsnd,
+ .msg_queue_msgrcv = jail_msg_queue_msgrcv,
+
+ .sem_alloc_security = jail_sem_alloc_security,
+ .sem_free_security = jail_sem_free_security,
+ .sem_associate = jail_sem_associate,
+ .sem_semctl = jail_sem_semctl,
+ .sem_semop = jail_sem_semop,
+};
+
+static int __init bsdjail_init (void)
+{
+ int rc = 0;
+
+ if (register_security (&bsdjail_security_ops)) {
+ printk (KERN_INFO
+ "Failure registering BSD Jail module with the kernel\n");
+
+ rc = mod_reg_security(MY_NAME, &bsdjail_security_ops);
+ if (rc < 0) {
+ printk (KERN_INFO "Failure registering BSD Jail "
+ " module with primary security module.\n");
+ return -EINVAL;
+ }
+ secondary = 1;
+ }
+ printk (KERN_INFO "BSD Jail module initialized.\n");
+
+ return 0;
+}
+
+static void __exit bsdjail_exit (void)
+{
+ if (secondary) {
+ if (mod_unreg_security (MY_NAME, &bsdjail_security_ops))
+ printk (KERN_INFO "Failure unregistering BSD Jail "
+ " module with primary module.\n");
+ } else {
+ if (unregister_security (&bsdjail_security_ops)) {
+ printk (KERN_INFO "Failure unregistering BSD Jail "
+ "module with the kernel\n");
+ }
+ }
+
+ printk (KERN_INFO "BSD Jail module removed\n");
+}
+
+security_initcall (bsdjail_init);
+module_exit (bsdjail_exit);
+
+MODULE_DESCRIPTION("BSD Jail LSM.");
+MODULE_LICENSE("GPL");
diff -Nru -p1 /home/hallyn/kernels/linux-2.6.8.1/security/Kconfig linux-2.6.8.1/security/Kconfig
--- /home/hallyn/kernels/linux-2.6.8.1/security/Kconfig 2004-08-14 05:55:47.000000000 -0500
+++ linux-2.6.8.1/security/Kconfig 2004-09-13 11:49:28.000000000 -0500
@@ -48,2 +48,13 @@ source security/selinux/Kconfig
+config SECURITY_BSDJAIL
+ tristate "BSD Jail LSM"
+ depends on SECURITY
+ select SECURITY_NETWORK
+ help
+ Provides BSD Jail compartmentalization functionality.
+ See Documentation/bsdjail.txt for more information and
+ usage instructions.
+
+ If you are unsure how to answer this question, answer N.
+
endmenu
diff -Nru -p1 /home/hallyn/kernels/linux-2.6.8.1/security/Makefile linux-2.6.8.1/security/Makefile
--- /home/hallyn/kernels/linux-2.6.8.1/security/Makefile 2004-08-14 05:55:48.000000000 -0500
+++ linux-2.6.8.1/security/Makefile 2004-09-13 11:49:28.000000000 -0500
@@ -17 +17,2 @@ obj-$(CONFIG_SECURITY_CAPABILITIES) += c
obj-$(CONFIG_SECURITY_ROOTPLUG) += commoncap.o root_plug.o
+obj-$(CONFIG_SECURITY_BSDJAIL) += bsdjail.o
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] BSD Jail LSM
2004-09-13 23:20 ` [PATCH] BSD Jail LSM Serge Hallyn
@ 2004-09-13 23:58 ` Vincent Hanquez
2004-09-14 14:04 ` Serge E. Hallyn
0 siblings, 1 reply; 12+ messages in thread
From: Vincent Hanquez @ 2004-09-13 23:58 UTC (permalink / raw)
To: Serge Hallyn
Cc: Alan Cox, Chris Wright, Linux Kernel Mailing List, akpm, netdev
On Mon, Sep 13, 2004 at 06:20:05PM -0500, Serge Hallyn wrote:
> +#define in_use(x) (x->jail_flags & IN_USE)
> +#define set_in_use(x) (x->jail_flags |= IN_USE)
> +
> +#define got_network(x) (x->jail_flags & (GOT_IPV4 | GOT_IPV6))
> +#define got_ipv4(x) (x->jail_flags & (GOT_IPV4))
> +#define got_ipv6(x) (x->jail_flags & (GOT_IPV6))
> +#define set_ipv4(x) (x->jail_flags |= GOT_IPV4)
> +#define set_ipv6(x) (x->jail_flags |= GOT_IPV6)
> +#define unset_got_ipv4(x) (x->jail_flags &= ~GOT_IPV4)
> +#define unset_got_ipv6(x) (x->jail_flags &= ~GOT_IPV6)
> +
> +#define get_task_security(task) (task->security)
> +#define get_inode_security(inode) (inode->i_security)
> +#define get_sock_security(sock) (sock->sk_security)
> +#define get_file_security(file) (file->f_security)
> +#define get_ipc_security(ipc) (ipc->security)
> +
> +#define jail_of(proc) (get_task_security(proc))
> +
> +#define set_task_security(task,data) task->security = data
> +#define set_inode_security(inode,data) inode->i_security = data
> +#define set_sock_security(sock,data) sock->sk_security = data
> +#define set_file_security(file,data) file->f_security = data
> +#define set_ipc_security(ipc,data) ipc.security = data
Hi Serge,
Do you really need all thoses macros ?
It seems to me that's too much macros for stuff which are easy
to write and to understand.
Just my 2cents,
--
Tab
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] BSD Jail LSM
2004-09-13 23:58 ` Vincent Hanquez
@ 2004-09-14 14:04 ` Serge E. Hallyn
2004-09-14 18:13 ` Chris Wright
0 siblings, 1 reply; 12+ messages in thread
From: Serge E. Hallyn @ 2004-09-14 14:04 UTC (permalink / raw)
To: Vincent Hanquez; +Cc: Linux Kernel Mailing List
> Hi Serge,
>
> Do you really need all thoses macros ?
> It seems to me that's too much macros for stuff which are easy
> to write and to understand.
Hi,
the _security macros are there because I'm working with 3 ways of stacking
security modules which share the ->security fields, where these can
turn into static inlines. Being able to just change the defines has
been very helpful.
I guess I've grown used to seeing them so I didn't even notice. I
will send out a new patch with the #defines removed tomorrow if that's
deemed helpful.
thanks,
-serge
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] BSD Jail LSM
2004-09-14 14:04 ` Serge E. Hallyn
@ 2004-09-14 18:13 ` Chris Wright
0 siblings, 0 replies; 12+ messages in thread
From: Chris Wright @ 2004-09-14 18:13 UTC (permalink / raw)
To: Serge E. Hallyn; +Cc: Vincent Hanquez, Linux Kernel Mailing List
* Serge E. Hallyn (serue@us.ibm.com) wrote:
> > Do you really need all thoses macros ?
> > It seems to me that's too much macros for stuff which are easy
> > to write and to understand.
>
> the _security macros are there because I'm working with 3 ways of stacking
> security modules which share the ->security fields, where these can
> turn into static inlines. Being able to just change the defines has
> been very helpful.
>
> I guess I've grown used to seeing them so I didn't even notice. I
> will send out a new patch with the #defines removed tomorrow if that's
> deemed helpful.
For now they are fine as they are.
thanks,
-chris
--
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2004-09-14 18:20 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-09-10 20:21 [PATCH] BSD Jail LSM (1/3) Serge Hallyn
2004-09-10 20:23 ` [PATCH] BSD Jail LSM (2/3) Serge Hallyn
2004-09-10 19:31 ` Alan Cox
2004-09-12 23:33 ` Serge E. Hallyn
2004-09-13 10:56 ` Alan Cox
2004-09-13 15:08 ` Serge E. Hallyn
2004-09-13 23:20 ` [PATCH] BSD Jail LSM Serge Hallyn
2004-09-13 23:58 ` Vincent Hanquez
2004-09-14 14:04 ` Serge E. Hallyn
2004-09-14 18:13 ` Chris Wright
2004-09-12 21:12 ` [PATCH] BSD Jail LSM (2/3) Herbert Poetzl
2004-09-10 20:23 ` [PATCH] BSD Jail LSM (3/3) Serge Hallyn
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox