From: Oleg Nesterov <oleg@redhat.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: EunTaik Lee <eun.taik.lee@samsung.com>,
"mingo@redhat.com" <mingo@redhat.com>,
"peterz@infradead.org" <peterz@infradead.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] sched/pid fix use-after free in task_tgid_vnr
Date: Mon, 12 Dec 2016 14:46:11 +0100 [thread overview]
Message-ID: <20161212134611.GA11078@redhat.com> (raw)
In-Reply-To: <87wpf8hhfc.fsf@xmission.com>
On 12/10, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@redhat.com> writes:
>
> > On 12/09, EunTaik Lee wrote:
> >>
> >> There is a use-after-free case with below call stack.
> >>
> >> pid_nr_ns+0x10/0x38
> >> cgroup_pidlist_start+0x144/0x400
> >> cgroup_seqfile_start+0x1c/0x24
> >> kernfs_seq_start+0x54/0x90
> >> seq_read+0x15c/0x3a8
> >> kernfs_fop_read+0x38/0x160
> >> __vfs_read+0x28/0xc8
> >> vfs_read+0x84/0xfc
>
> How is this a use after free. The function pid_nr_ns should take a NULL pointer
> as input and return 0?
No, the task (task_struct) itself can't go away, but task->group_leader
can point to nowhere.
> Certainly if the addtion of pid_alive fixes it pid_vnr(task_tgid(tsk))
> is fine. Are we perhaps missing rcu locking?
rcu_read_lock() is not enough in this case, see below.
> Or is the problem simply that in task_tgid we are accessing
> task->group_leader which may already be dead?
Yes. Lets forget about the callchain above, I didn't even bother to verify
that it can actually hit the problem. Although I think EunTaik is very right,
css_task_iter_next() does get_task_struct() and drops css_set_lock, the task
can exit after that. Forget.
Just suppose that a task simply does
pid = task_tgid_vnr(current);
after it has already called exit_notify(). And this is what perf_event_pid()
does, perhaps we have more buggy users.
In this case current->group_leader or parent/real_parent can point to the
exited/freed tasks. I already said this many times, ee really need to nullify
them in __unhash_process() but this needs a lot of (mostly simple) cleanups.
> If so the fix needs to be
> in task_tgid.
Yes, task_tgid() should probably return NULL in this case, but this connects
to "a lot of cleanups" above.
> > --- x/include/linux/pid.h
> > +++ x/include/linux/pid.h
> > @@ -8,7 +8,8 @@ enum pid_type
> > PIDTYPE_PID,
> > PIDTYPE_PGID,
> > PIDTYPE_SID,
> > - PIDTYPE_MAX
> > + PIDTYPE_MAX,
> > + PIDTYPE_TGID /* do not use */
>
>
> I would do:
>
> /* __PIDTYPE_TGID is only valid to __task_pid_nr_ns */
> #define __PIDTYPE_TGID PIDTYPE_MAX
>
> Prefixing __PIDTYPE_TGID with __ should help make it clear
> this is a special use define.
OK, will do, thanks.
> I am also curious why pid_alive is the proper check to see if
> task->group_leader is valid? That feels like it could get us into
> trouble later.
It is. pid_alive(task) == T means that this task was not removed from
rcu-protected-lists and thus task->group_leader (in particular) can't
go away. As long as you check pid_alive() and use ->group_leader in
the same rcu_read_lock section, of course.
To clarify, this is not because detach_pid(PIDTYPE_PID) is called
before list_del_rcu(), this doesn't matter. What does matter is that
if you see pid_alive() == T (or task->sighand != NULL, or anything
else which means that release_task/__unhash_process was not called yet)
you know that this task, its leader/parent/real_parent/pids/etc can't
go away until rcu_read_unlock().
And this is why __task_pid_nr_ns() checks pid_alive() before it reads
->group_leader. Note that __task_pid_nr_ns(PIDTYPE_PID) does not need
this check, task->pids[PID] is nullified by detach_pid() and pid_nr_ns()
check pid != NULL.
However. I think it should be renamed. Or, better, we should add a new
helper to make this all more clear. Say,
bool task_is_rcu_safe(task)
{
return task->sighand != NULL;
}
then change __task_pid_nr_ns() to use this helper rather than pid_alive().
Because, once again, it is not that we need to ensure that
pids[PIDTYPE_PID].pid != NULL, we need to ensure that rcu_read_lock() can
actually protect this task and its leader/parent/etc. And of course, it
can have more users.
Oleg.
next prev parent reply other threads:[~2016-12-12 13:46 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CGME20161209093351epcms1p418673c3cdec7d4c3e81b5df131173c57@epcms1p4>
2016-12-09 9:33 ` [PATCH] sched/pid fix use-after free in task_tgid_vnr EunTaik Lee
2016-12-09 17:21 ` Oleg Nesterov
2016-12-09 22:21 ` Eric W. Biederman
2016-12-12 13:46 ` Oleg Nesterov [this message]
2016-12-12 19:10 ` Eric W. Biederman
2016-12-13 16:03 ` Oleg Nesterov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20161212134611.GA11078@redhat.com \
--to=oleg@redhat.com \
--cc=ebiederm@xmission.com \
--cc=eun.taik.lee@samsung.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.