From: Oleg Nesterov <oleg@tv-sign.ru>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andrew Morton <akpm@osdl.org>, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 01/23] tref: Implement task references.
Date: Thu, 02 Mar 2006 22:16:09 +0300
Message-ID: <44074479.15D306EB@tv-sign.ru>
In-Reply-To: m1mzgidnr0.fsf@ebiederm.dsl.xmission.com
Eric W. Biederman wrote:
>
> Holding a reference to a task_struct pins about 10K of low memory even after
> that task has exited. Which seems to be at 1 or 2 orders of magnitude more
> memory than any other data structure in the kernel. Not holding a reference
> to a task_struct and you risk problems with pid wrap around.
I think there is another, much simpler solution. We can make a "reference" to the
pid itself to protect it against free_pidmap(), so that this pid can't be reused.

struct pid_ref
{
	pid_t			pid;
	atomic_t		count;
	struct hlist_node	chain;
};

// allocated in pidhash_init()
static struct hlist_head *ref_hash;

struct pid_ref *find_pid_ref(pid_t pid)
{
	struct hlist_node *elem;
	struct pid_ref *ref;

	hlist_for_each_entry(ref, elem, &ref_hash[pid_hashfn(pid)], chain)
		if (ref->pid == pid)
			return ref;

	return NULL;
}

// just s/free_pidmap/__free_pidmap/
static void __free_pidmap(int pid)
{
	pidmap_t *map = pidmap_array + pid / BITS_PER_PAGE;
	int offset = pid & BITS_PER_PAGE_MASK;

	clear_bit(offset, map->page);
	atomic_inc(&map->nr_free);
}

fastcall void free_pidmap(int pid)
{
	if (!find_pid_ref(pid))
		__free_pidmap(pid);
}

static int pid_inuse(pid_t pid)
{
	int type;

	for (type = 0; type < PIDTYPE_MAX; ++type)
		if (find_pid(type, pid))
			return 1;

	return 0;
}

// simple, non-optimized version
struct pid_ref *mk_pid_ref(pid_t pid)
{
	struct pid_ref *ref;

	write_lock_irq(&tasklist_lock);
	ref = find_pid_ref(pid);
	if (ref)
		atomic_inc(&ref->count);
	else if (pid_inuse(pid)) {
		ref = kmalloc(sizeof(*ref), GFP_ATOMIC);
		if (ref) {
			ref->pid = pid;
			atomic_set(&ref->count, 1);
			hlist_add_head(&ref->chain,
					&ref_hash[pid_hashfn(pid)]);
		}
	}
	write_unlock_irq(&tasklist_lock);

	return ref;
}

void put_pid_ref(struct pid_ref *ref)
{
	if (!ref || !atomic_dec_and_test(&ref->count))
		return;

	write_lock_irq(&tasklist_lock);
	if (!atomic_read(&ref->count)) {
		if (!pid_inuse(ref->pid))
			__free_pidmap(ref->pid);
		hlist_del(&ref->chain);
		kfree(ref);
	}
	write_unlock_irq(&tasklist_lock);
}

That's all. The only modified function is free_pidmap(), and the change is
trivial. Example of usage:

struct fown_struct {
	...
-	int pid;
+	struct pid_ref *ref;
+	enum pid_type type;
	...
};

void file_free(struct file *f)
{
+	put_pid_ref(f->f_owner.ref);
	...
}

void f_modown(struct file *filp, int pid, uid_t uid, uid_t euid, int force)
{
	struct pid_ref *old, *ref;
	enum pid_type type = PIDTYPE_PID;

	if (pid < 0) {
		pid = -pid;
		type = PIDTYPE_PGID;
	}

	ref = mk_pid_ref(pid);

	write_lock_irq(&filp->f_owner.lock);
	old = ref;
	if (force || !filp->f_owner.ref) {
		old = filp->f_owner.ref;
		filp->f_owner.ref = ref;
		filp->f_owner.type = type;
		filp->f_owner.uid = uid;
		filp->f_owner.euid = euid;
	}
	write_unlock_irq(&filp->f_owner.lock);

	put_pid_ref(old);
}

void send_sigio(struct fown_struct *fown, int fd, int band)
{
	struct task_struct *p;

	read_lock(&fown->lock);
	if (!fown->ref)
		goto out_unlock_fown;

	read_lock(&tasklist_lock);
	do_each_task_pid(fown->ref->pid, fown->type, p)
		send_sigio_to_task(p, fown, fd, band);
	while_each_task_pid(fown->ref->pid, fown->type, p);
	read_unlock(&tasklist_lock);

out_unlock_fown:
	read_unlock(&fown->lock);
}
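
To see the lifetime rule on its own, here is a rough userspace model of the
scheme. It is only a sketch and makes several assumptions that are not in the
kernel code above: a small bool array stands in for pidmap_array, a second
array (pid_attached) stands in for pid_inuse(), a plain linked list replaces
ref_hash, alloc_pid() simply picks the lowest free number, and there is no
locking at all. The names alloc_pid(), exit_task() and pid_ref_demo() are
made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_PID 64			/* tiny pid space, assumption for the demo */

static bool pid_bitmap[MAX_PID];	/* models pidmap_array: true = allocated */
static bool pid_attached[MAX_PID];	/* models pid_inuse(): a task still uses it */

struct pid_ref {
	int pid;
	int count;			/* plain int, the model is single-threaded */
	struct pid_ref *next;
};

static struct pid_ref *ref_list;	/* one list instead of ref_hash[] */

struct pid_ref *find_pid_ref(int pid)
{
	for (struct pid_ref *r = ref_list; r; r = r->next)
		if (r->pid == pid)
			return r;
	return NULL;
}

/* Hypothetical allocator: lowest free pid, or -1 if the space is full. */
int alloc_pid(void)
{
	for (int pid = 1; pid < MAX_PID; pid++)
		if (!pid_bitmap[pid]) {
			pid_bitmap[pid] = true;
			pid_attached[pid] = true;
			return pid;
		}
	return -1;
}

/* Task exit: detach the pid, then do what free_pidmap() does above. */
void exit_task(int pid)
{
	pid_attached[pid] = false;
	if (!find_pid_ref(pid))
		pid_bitmap[pid] = false;
}

struct pid_ref *mk_pid_ref(int pid)
{
	struct pid_ref *ref = find_pid_ref(pid);

	if (ref) {
		ref->count++;
	} else if (pid_attached[pid]) {
		ref = malloc(sizeof(*ref));
		if (ref) {
			ref->pid = pid;
			ref->count = 1;
			ref->next = ref_list;
			ref_list = ref;
		}
	}
	return ref;
}

void put_pid_ref(struct pid_ref *ref)
{
	if (!ref || --ref->count)
		return;

	/* Last reference gone: release the pid if no task holds it. */
	if (!pid_attached[ref->pid])
		pid_bitmap[ref->pid] = false;

	for (struct pid_ref **pp = &ref_list; *pp; pp = &(*pp)->next)
		if (*pp == ref) {
			*pp = ref->next;
			break;
		}
	free(ref);
}

/* Walks through the scenario: a referenced pid is not reused after exit. */
int pid_ref_demo(void)
{
	int a = alloc_pid();
	struct pid_ref *ref = mk_pid_ref(a);

	assert(ref != NULL);
	exit_task(a);			/* task is gone, reference remains */

	int b = alloc_pid();
	assert(b != a);			/* pid a must not be handed out again */

	put_pid_ref(ref);		/* last reference dropped */

	int c = alloc_pid();
	assert(c == a);			/* now pid a is free for reuse */
	return 0;
}
```

Calling pid_ref_demo() from a main() runs the whole scenario. The point of the
model is only the ordering: the bitmap bit is cleared by whichever comes last,
the task exit or the final put_pid_ref().
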
What do you think?
Oleg.