From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1762750AbXGZGnY (ORCPT ); Thu, 26 Jul 2007 02:43:24 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1754419AbXGZGnQ (ORCPT ); Thu, 26 Jul 2007 02:43:16 -0400 Received: from mailhub.sw.ru ([195.214.233.200]:17451 "EHLO relay.sw.ru" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755126AbXGZGnP (ORCPT ); Thu, 26 Jul 2007 02:43:15 -0400 Message-ID: <46A8424E.1010805@openvz.org> Date: Thu, 26 Jul 2007 10:42:22 +0400 From: Pavel Emelyanov User-Agent: Thunderbird 2.0.0.5 (X11/20070716) MIME-Version: 1.0 To: sukadev@us.ibm.com CC: Serge Hallyn , Linux Containers , Linux Kernel Mailing List Subject: Re: [PATCH 13/16] Switch to operating with pid_numbers instead of pids References: <468DF6F7.1010906@openvz.org> <468DF907.2060809@openvz.org> <20070725003619.GF3287@us.ibm.com> <46A720C8.5070803@openvz.org> <20070725191334.GA19976@us.ibm.com> In-Reply-To: <20070725191334.GA19976@us.ibm.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org sukadev@us.ibm.com wrote: > Pavel Emelianov [xemul@openvz.org] wrote: > | sukadev@us.ibm.com wrote: > | >Pavel Emelianov [xemul@openvz.org] wrote: > | >| Make alloc_pid() initialize pid_numbers and hash them > | >| into the hashtable, not the struct pid itself. > | >| > | >| Signed-off-by: Pavel Emelianov > | >| > | >| --- > | >| > | >| pid.c | 47 +++++++++++++++++++++++++++++++++-------------- > | >| 1 files changed, 33 insertions(+), 14 deletions(-) > | >| > | >| --- ./kernel/pid.c.ve12 2007-07-05 11:06:41.000000000 +0400 > | >| +++ ./kernel/pid.c 2007-07-05 11:08:23.000000000 +0400 > | >| @@ -28,8 +28,10 @@ > | >| #include > | >| #include > | >| #include > | >| +#include > | >| > | >| -#define pid_hashfn(nr) hash_long((unsigned long)nr, pidhash_shift) > | >| +#define pid_hashfn(nr, ns) \ > | >| + hash_long((unsigned long)nr + (unsigned long)ns, pidhash_shift) > | >| static struct hlist_head *pid_hash; > | >| static int pidhash_shift; > | >| struct pid init_struct_pid = INIT_STRUCT_PID; > | >| @@ -194,7 +198,7 @@ fastcall void put_pid(struct pid *pid) > | >| if (!pid) > | >| return; > | >| > | >| - ns = pid->numbers[0].ns; > | >| + ns = pid->numbers[pid->level].ns; > | >| if ((atomic_read(&pid->count) == 1) || > | >| atomic_dec_and_test(&pid->count)) > | >| kmem_cache_free(ns->pid_cachep, pid); > | >| @@ -210,13 +214,17 @@ static void delayed_put_pid(struct rcu_h > | >| fastcall void free_pid(struct pid *pid) > | >| { > | >| /* We can be called with write_lock_irq(&tasklist_lock) held */ > | >| + int i; > | >| unsigned long flags; > | >| > | >| spin_lock_irqsave(&pidmap_lock, flags); > | >| - hlist_del_rcu(&pid->pid_chain); > | >| + for (i = 0; i <= pid->level; i++) > | >| + hlist_del_rcu(&pid->numbers[i].pid_chain); > | >| spin_unlock_irqrestore(&pidmap_lock, flags); > | >| > | >| - free_pidmap(&init_pid_ns, pid->nr); > | >| + for (i = 0; i <= pid->level; i++) > | >| + free_pidmap(pid->numbers[i].ns, pid->numbers[i].nr); > | >| + > | >| call_rcu(&pid->rcu, delayed_put_pid); > | >| } > | >| > | >| @@ -224,30 +232,43 @@ struct pid *alloc_pid(struct pid_namespa > | >| { > | >| struct pid *pid; > | >| enum pid_type type; > | >| - int nr = -1; > | >| + struct pid_namespace *ns; > | >| + int i, nr; > | >| > | >| - pid = kmem_cache_alloc(init_pid_ns.pid_cachep, GFP_KERNEL); > | >| + pid = kmem_cache_alloc(pid_ns->pid_cachep, GFP_KERNEL); > | >| if (!pid) > | >| goto out; > | >| > | >| - nr = alloc_pidmap(current->nsproxy->pid_ns); > | >| - if (nr < 0) > | >| - goto out_free; > | >| + ns = pid_ns; > | >| + for (i = pid_ns->level; i >= 0; i--) { > | >| + nr = alloc_pidmap(ns); > | >| + if (nr < 0) > | >| + goto out_free; > | > > | >If pid_ns->level is say 3 and alloc_pidmap() succeeds when i=0,1 > | > | It cannot :) If level is 3, then we'll allocate for 3, 2, 1, 0 sequence. > | The loop is descending, not ascending... > > Aah descending - thats right. But I still think there is a problem. > > Here is your code that I am referring to: > > pid = kmem_cache_alloc(pid_ns->pid_cachep, GFP_KERNEL); > if (!pid) > goto out; > > ns = pid_ns; > for (i = pid_ns->level; i >= 0; i--) { > nr = alloc_pidmap(ns); > if (nr < 0) > goto out_free; > > pid->numbers[i].nr = nr; > pid->numbers[i].ns = ns; > ns = ns->parent; > } > > pid->level = pid_ns->level; > > > > out: > return pid; > out_free: > for (i++; i <= pid->level; i++) > free_pidmap(pid->numbers[i].ns, pid->numbers[i].nr); > > kmem_cache_free(pid_ns->pid_cachep, pid); > pid = NULL; > goto out; > > > > Lets say initially pid_ns->level = 3 and alloc_pidmap() succeeds for > i=3 and i=2 but fails for i=1 and we execute "goto out_free". > > But pid->level is uninitialized at this point right ? > > Even if it were set to zero (using kmem_cache_zalloc()), we may not > free the two pidmap entries we allocated for i=3 and i=2. Yes. I found this after detailed look at the code and (hope) fixed. > Suka > Thanks, Pavel