public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 3/3] pids: refactor vnr/nr_ns helpers to make them safe
@ 2009-01-16  5:55 Oleg Nesterov
  2009-01-16 10:35 ` Louis Rilling
  2009-01-16 22:21 ` Andrew Morton
  0 siblings, 2 replies; 5+ messages in thread
From: Oleg Nesterov @ 2009-01-16  5:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Eric W. Biederman, Pavel Emelyanov, Sukadev Bhattiprolu,
	linux-kernel

Inho, the safety rules for vnr/nr_ns helpers are horrible and buggy.

task_pid_nr_ns(task) needs rcu/tasklist depending on task == current.

As for "special" pids, vnr/nr_ns helpers always need rcu. However,
if task != current, they are unsafe even under rcu lock, we can't
trust task->group_leader without the special checks.

And almost every helper has a callsite which needs a fix.

Also, it is a bit annoying that the implementations of, say,
task_pgrp_vnr() and task_pgrp_nr_ns() are not "symmetrical".

This patch introduces the new helper, __task_pid_nr_ns(), which is
always safe to use, and turns all other helpers into the trivial
wrappers.

If this patch is acceptable, I'll send another one which converts
task_tgid_xxx() as well, there are a bit special.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>

--- CUR/include/linux/sched.h~SP_3_NR	2009-01-16 02:13:50.000000000 +0100
+++ CUR/include/linux/sched.h	2009-01-16 04:16:12.000000000 +0100
@@ -1481,17 +1481,23 @@ struct pid_namespace;
  *
  * see also pid_nr() etc in include/linux/pid.h
  */
+pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type,
+			struct pid_namespace *ns);
 
 static inline pid_t task_pid_nr(struct task_struct *tsk)
 {
 	return tsk->pid;
 }
 
-pid_t task_pid_nr_ns(struct task_struct *tsk, struct pid_namespace *ns);
+static inline pid_t task_pid_nr_ns(struct task_struct *tsk,
+					struct pid_namespace *ns)
+{
+	return __task_pid_nr_ns(tsk, PIDTYPE_PID, ns);
+}
 
 static inline pid_t task_pid_vnr(struct task_struct *tsk)
 {
-	return pid_vnr(task_pid(tsk));
+	return __task_pid_nr_ns(tsk, PIDTYPE_PID, NULL);
 }
 
 
@@ -1513,11 +1519,15 @@ static inline pid_t task_pgrp_nr(struct 
 	return tsk->signal->__pgrp;
 }
 
-pid_t task_pgrp_nr_ns(struct task_struct *tsk, struct pid_namespace *ns);
+static inline pid_t task_pgrp_nr_ns(struct task_struct *tsk,
+					struct pid_namespace *ns)
+{
+	return __task_pid_nr_ns(tsk, PIDTYPE_PGID, ns);
+}
 
 static inline pid_t task_pgrp_vnr(struct task_struct *tsk)
 {
-	return pid_vnr(task_pgrp(tsk));
+	return __task_pid_nr_ns(tsk, PIDTYPE_PGID, NULL);
 }
 
 
@@ -1526,14 +1536,17 @@ static inline pid_t task_session_nr(stru
 	return tsk->signal->__session;
 }
 
-pid_t task_session_nr_ns(struct task_struct *tsk, struct pid_namespace *ns);
+static inline pid_t task_session_nr_ns(struct task_struct *tsk,
+					struct pid_namespace *ns)
+{
+	return __task_pid_nr_ns(tsk, PIDTYPE_SID, ns);
+}
 
 static inline pid_t task_session_vnr(struct task_struct *tsk)
 {
-	return pid_vnr(task_session(tsk));
+	return __task_pid_nr_ns(tsk, PIDTYPE_SID, NULL);
 }
 
-
 /**
  * pid_alive - check that a task structure is not stale
  * @p: Task structure to be checked.
--- CUR/kernel/pid.c~SP_3_NR	2009-01-16 02:54:26.000000000 +0100
+++ CUR/kernel/pid.c	2009-01-16 05:40:43.000000000 +0100
@@ -452,11 +452,24 @@ pid_t pid_vnr(struct pid *pid)
 }
 EXPORT_SYMBOL_GPL(pid_vnr);
 
-pid_t task_pid_nr_ns(struct task_struct *tsk, struct pid_namespace *ns)
+pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type,
+			struct pid_namespace *ns)
 {
-	return pid_nr_ns(task_pid(tsk), ns);
+	pid_t nr = 0;
+
+	rcu_read_lock();
+	if (!ns)
+		ns = current->nsproxy->pid_ns;
+	if (likely(pid_alive(task))) {
+		if (type != PIDTYPE_PID)
+			task = task->group_leader;
+		nr = pid_nr_ns(task->pids[type].pid, ns);
+	}
+	rcu_read_unlock();
+
+	return nr;
 }
-EXPORT_SYMBOL(task_pid_nr_ns);
+EXPORT_SYMBOL(__task_pid_nr_ns);
 
 pid_t task_tgid_nr_ns(struct task_struct *tsk, struct pid_namespace *ns)
 {
@@ -464,18 +477,6 @@ pid_t task_tgid_nr_ns(struct task_struct
 }
 EXPORT_SYMBOL(task_tgid_nr_ns);
 
-pid_t task_pgrp_nr_ns(struct task_struct *tsk, struct pid_namespace *ns)
-{
-	return pid_nr_ns(task_pgrp(tsk), ns);
-}
-EXPORT_SYMBOL(task_pgrp_nr_ns);
-
-pid_t task_session_nr_ns(struct task_struct *tsk, struct pid_namespace *ns)
-{
-	return pid_nr_ns(task_session(tsk), ns);
-}
-EXPORT_SYMBOL(task_session_nr_ns);
-
 struct pid_namespace *task_active_pid_ns(struct task_struct *tsk)
 {
 	return ns_of_pid(task_pid(tsk));


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 3/3] pids: refactor vnr/nr_ns helpers to make them safe
  2009-01-16  5:55 [PATCH 3/3] pids: refactor vnr/nr_ns helpers to make them safe Oleg Nesterov
@ 2009-01-16 10:35 ` Louis Rilling
  2009-01-16 20:45   ` Oleg Nesterov
  2009-01-16 22:21 ` Andrew Morton
  1 sibling, 1 reply; 5+ messages in thread
From: Louis Rilling @ 2009-01-16 10:35 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Eric W. Biederman, Pavel Emelyanov,
	Sukadev Bhattiprolu, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2540 bytes --]

Hi Oleg,

On 16/01/09  6:55 +0100, Oleg Nesterov wrote:
> Inho, the safety rules for vnr/nr_ns helpers are horrible and buggy.
> 
> task_pid_nr_ns(task) needs rcu/tasklist depending on task == current.
> 
> As for "special" pids, vnr/nr_ns helpers always need rcu. However,
> if task != current, they are unsafe even under rcu lock, we can't
> trust task->group_leader without the special checks.
> 
> And almost every helper has a callsite which needs a fix.
> 
> Also, it is a bit annoying that the implementations of, say,
> task_pgrp_vnr() and task_pgrp_nr_ns() are not "symmetrical".
> 
> This patch introduces the new helper, __task_pid_nr_ns(), which is
> always safe to use, and turns all other helpers into the trivial
> wrappers.
> 
> If this patch is acceptable, I'll send another one which converts
> task_tgid_xxx() as well, there are a bit special.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> 

[...]

> --- CUR/kernel/pid.c~SP_3_NR	2009-01-16 02:54:26.000000000 +0100
> +++ CUR/kernel/pid.c	2009-01-16 05:40:43.000000000 +0100
> @@ -452,11 +452,24 @@ pid_t pid_vnr(struct pid *pid)
>  }
>  EXPORT_SYMBOL_GPL(pid_vnr);
>  
> -pid_t task_pid_nr_ns(struct task_struct *tsk, struct pid_namespace *ns)
> +pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type,
> +			struct pid_namespace *ns)
>  {
> -	return pid_nr_ns(task_pid(tsk), ns);
> +	pid_t nr = 0;
> +
> +	rcu_read_lock();
> +	if (!ns)
> +		ns = current->nsproxy->pid_ns;
> +	if (likely(pid_alive(task))) {

I don't see what this pid_alive() check buys you. Since tasklist_lock is not
enforced, nothing prevents another CPU from detaching the pid right after the
check.

I'm also a bit puzzled by your description with using tasklist_lock when task !=
current, and not seeing tasklist_lock anywhere in the patch. Does this mean that
"safe" is for "no access to freed memory is done, but caller has to take
tasklist_lock or may get 0 as return value"?

Thanks,

Louis

> +		if (type != PIDTYPE_PID)
> +			task = task->group_leader;
> +		nr = pid_nr_ns(task->pids[type].pid, ns);
> +	}
> +	rcu_read_unlock();
> +
> +	return nr;
>  }
> -EXPORT_SYMBOL(task_pid_nr_ns);
> +EXPORT_SYMBOL(__task_pid_nr_ns);
>  
>  pid_t task_tgid_nr_ns(struct task_struct *tsk, struct pid_namespace *ns)
>  {

[...]

-- 
Dr Louis Rilling			Kerlabs
Skype: louis.rilling			Batiment Germanium
Phone: (+33|0) 6 80 89 08 23		80 avenue des Buttes de Coesmes
http://www.kerlabs.com/			35700 Rennes

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 3/3] pids: refactor vnr/nr_ns helpers to make them safe
  2009-01-16 10:35 ` Louis Rilling
@ 2009-01-16 20:45   ` Oleg Nesterov
  2009-01-16 21:48     ` Louis Rilling
  0 siblings, 1 reply; 5+ messages in thread
From: Oleg Nesterov @ 2009-01-16 20:45 UTC (permalink / raw)
  To: Andrew Morton, Eric W. Biederman, Pavel Emelyanov,
	Sukadev Bhattiprolu, linux-kernel

Hi Louis,

On 01/16, Louis Rilling wrote:
>
> On 16/01/09  6:55 +0100, Oleg Nesterov wrote:
> > +			struct pid_namespace *ns)
> >  {
> > -	return pid_nr_ns(task_pid(tsk), ns);
> > +	pid_t nr = 0;
> > +
> > +	rcu_read_lock();
> > +	if (!ns)
> > +		ns = current->nsproxy->pid_ns;
> > +	if (likely(pid_alive(task))) {
>
> I don't see what this pid_alive() check buys you. Since tasklist_lock is not
> enforced, nothing prevents another CPU from detaching the pid right after the
> check.

pid_alive() should be renamed. We use it to make sure the task didn't pass
__unhash_process().

Yes, you are right, nothing prevents another CPU from detaching the pid right
after the  check. But this is fine: we read ->pids[].pid under rcu_read_lock(),
and if it is NULL pid_nr_ns() returns. So, we don't need pid_alive() check at
all.

However, we can not use task->group_leader unless we verify the task is still
alive. That is why we need this check. We do not clear ->group_leader when
the task exits, so we can't do

		rcu_read_lock();
		if (task->group_leader)
			do_something(task->group_leader);
		rcu_unread_lock();

Instead we use pid_alive() before using ->group_leader.

> I'm also a bit puzzled by your description with using tasklist_lock when task !=
> current, and not seeing tasklist_lock anywhere in the patch. Does this mean that
> "safe" is for "no access to freed memory is done, but caller has to take
> tasklist_lock or may get 0 as return value"?

I am not sure I understand the question...

This patch doesn't use tasklist, it relies on rcu. With this patch the caller
doesn't need tasklist/rcu to call these helpers (but of course, the caller
must ensure that task_struct is stable).

But, whatever the caller does, it can get 0 as return value anyway if the
task exists, this is correct. Or I misunderstood you?

Oleg.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 3/3] pids: refactor vnr/nr_ns helpers to make them safe
  2009-01-16 20:45   ` Oleg Nesterov
@ 2009-01-16 21:48     ` Louis Rilling
  0 siblings, 0 replies; 5+ messages in thread
From: Louis Rilling @ 2009-01-16 21:48 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Eric W. Biederman, Pavel Emelyanov,
	Sukadev Bhattiprolu, linux-kernel

On Fri, Jan 16, 2009 at 09:45:40PM +0100, Oleg Nesterov wrote:
> Hi Louis,
> 
> On 01/16, Louis Rilling wrote:
> >
> > On 16/01/09  6:55 +0100, Oleg Nesterov wrote:
> > > +			struct pid_namespace *ns)
> > >  {
> > > -	return pid_nr_ns(task_pid(tsk), ns);
> > > +	pid_t nr = 0;
> > > +
> > > +	rcu_read_lock();
> > > +	if (!ns)
> > > +		ns = current->nsproxy->pid_ns;
> > > +	if (likely(pid_alive(task))) {
> >
> > I don't see what this pid_alive() check buys you. Since tasklist_lock is not
> > enforced, nothing prevents another CPU from detaching the pid right after the
> > check.
> 
> pid_alive() should be renamed. We use it to make sure the task didn't pass
> __unhash_process().
> 
> Yes, you are right, nothing prevents another CPU from detaching the pid right
> after the  check. But this is fine: we read ->pids[].pid under rcu_read_lock(),
> and if it is NULL pid_nr_ns() returns. So, we don't need pid_alive() check at
> all.
> 
> However, we can not use task->group_leader unless we verify the task is still
> alive. That is why we need this check. We do not clear ->group_leader when
> the task exits, so we can't do
> 
> 		rcu_read_lock();
> 		if (task->group_leader)
> 			do_something(task->group_leader);
> 		rcu_unread_lock();
> 
> Instead we use pid_alive() before using ->group_leader.

Ok I see now. Since RCU is locked and pid_alive(task) has been true at
some point, task->group_leader cannot be freed because release_task()
does not release task->group_leader before releasing task. IOW
release_task() can't have started releasing task->group_leader before
rcu_read_lock().

Thank you for your explanation!

> 
> > I'm also a bit puzzled by your description with using tasklist_lock when task !=
> > current, and not seeing tasklist_lock anywhere in the patch. Does this mean that
> > "safe" is for "no access to freed memory is done, but caller has to take
> > tasklist_lock or may get 0 as return value"?
> 
> I am not sure I understand the question...
> 
> This patch doesn't use tasklist, it relies on rcu. With this patch the caller
> doesn't need tasklist/rcu to call these helpers (but of course, the caller
> must ensure that task_struct is stable).
> 
> But, whatever the caller does, it can get 0 as return value anyway if the
> task exists, this is correct. Or I misunderstood you?

My question was probably badly phrased. When reading the patch
description I thought that the point was to fix the helpers so that
tasklist_lock would be taken whenever task != current. Of course your
patch does not do this, and I'm perfectly fine with this.

Thanks,

Louis

-- 
Dr Louis Rilling			Kerlabs - IRISA
Skype: louis.rilling			Campus Universitaire de Beaulieu
Phone: (+33|0) 2 99 84 71 52		Avenue du General Leclerc
Fax: (+33|0) 2 99 84 71 71		35042 Rennes CEDEX - France
http://www.kerlabs.com/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 3/3] pids: refactor vnr/nr_ns helpers to make them safe
  2009-01-16  5:55 [PATCH 3/3] pids: refactor vnr/nr_ns helpers to make them safe Oleg Nesterov
  2009-01-16 10:35 ` Louis Rilling
@ 2009-01-16 22:21 ` Andrew Morton
  1 sibling, 0 replies; 5+ messages in thread
From: Andrew Morton @ 2009-01-16 22:21 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: ebiederm, xemul, sukadev, linux-kernel

On Fri, 16 Jan 2009 06:55:20 +0100
Oleg Nesterov <oleg@redhat.com> wrote:

> --- CUR/include/linux/sched.h~SP_3_NR	2009-01-16 02:13:50.000000000 +0100
> +++ CUR/include/linux/sched.h	2009-01-16 04:16:12.000000000 +0100
> @@ -1481,17 +1481,23 @@ struct pid_namespace;
>   *
>   * see also pid_nr() etc in include/linux/pid.h
>   */
> +pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type,
> +			struct pid_namespace *ns);

Sometime it would be nice to pull all the pid-related helpers out of
sched.h, into a dedicated header file which is included only by those
.c files which actually use them.


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2009-01-17 11:58 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-01-16  5:55 [PATCH 3/3] pids: refactor vnr/nr_ns helpers to make them safe Oleg Nesterov
2009-01-16 10:35 ` Louis Rilling
2009-01-16 20:45   ` Oleg Nesterov
2009-01-16 21:48     ` Louis Rilling
2009-01-16 22:21 ` Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox