From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932855AbcLMQFQ (ORCPT ); Tue, 13 Dec 2016 11:05:16 -0500 Received: from mx1.redhat.com ([209.132.183.28]:40964 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932495AbcLMQEK (ORCPT ); Tue, 13 Dec 2016 11:04:10 -0500 Date: Tue, 13 Dec 2016 17:03:41 +0100 From: Oleg Nesterov To: "Eric W. Biederman" Cc: EunTaik Lee , "mingo@redhat.com" , "peterz@infradead.org" , "linux-kernel@vger.kernel.org" Subject: Re: [PATCH] sched/pid fix use-after free in task_tgid_vnr Message-ID: <20161213160340.GA4653@redhat.com> References: <20161209093351epcms1p418673c3cdec7d4c3e81b5df131173c57@epcms1p4> <20161209172114.GA25742@redhat.com> <87wpf8hhfc.fsf@xmission.com> <20161212134611.GA11078@redhat.com> <87bmwhdkv1.fsf@xmission.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <87bmwhdkv1.fsf@xmission.com> User-Agent: Mutt/1.5.18 (2008-05-17) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.32]); Tue, 13 Dec 2016 16:04:10 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 12/13, Eric W. Biederman wrote: > > Oleg Nesterov writes: > > > In this case current->group_leader or parent/real_parent can point to the > > exited/freed tasks. I already said this many times, ee really need to nullify > > them in __unhash_process() but this needs a lot of (mostly simple) > > cleanups. > > Is there anything wrong with starting with the patch below? > > diff --git a/kernel/exit.c b/kernel/exit.c > index 9d68c45ebbe3..03daeecc335d 100644 > --- a/kernel/exit.c > +++ b/kernel/exit.c > @@ -200,6 +200,7 @@ void release_task(struct task_struct *p) > if (zap_leader) > leader->exit_state = EXIT_DEAD; > } > + p->group_leader = NULL; Yes, see above, this is what we should do. Except I think we should change __unhash_process() to do this. The same for parent/real_parent. > That seems to cut to the heart of the matter. Exactly. But we can not start with this change. Just for example, task_session/task_pgrp/tgid can obviously crash, even if used properly, they will need to check group_leader != NULL. __task_pid_nr_ns() should be changed too. And more. That is why I'd prefer to start with the simple fix I suggested before, then do the cleanups. And imo that change makes sense in any case to avoid the code duplication, even if we change __unhash_process() to nullify group_leader/etc. Note that we can also add __PIDTYPE_PPID and turn task_ppid_nr_ns() into another trivial users of __task_pid_nr_ns(). Just it will need to check task != NULL instead if pid_alive(p). So, will you agree with v2 below? s/PIDTYPE_TGID/__PIDTYPE_TGID/ as you suggested, and move task_tgid_nr_ns() into sched.h close to other similar helpers. Oleg. --- a/include/linux/pid.h +++ b/include/linux/pid.h @@ -8,7 +8,9 @@ enum pid_type PIDTYPE_PID, PIDTYPE_PGID, PIDTYPE_SID, - PIDTYPE_MAX + PIDTYPE_MAX, + /* only valid to __task_pid_nr_ns() */ + __PIDTYPE_TGID }; /* --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -2117,14 +2117,6 @@ static inline pid_t task_tgid_nr(struct task_struct *tsk) return tsk->tgid; } -pid_t task_tgid_nr_ns(struct task_struct *tsk, struct pid_namespace *ns); - -static inline pid_t task_tgid_vnr(struct task_struct *tsk) -{ - return pid_vnr(task_tgid(tsk)); -} - - static inline int pid_alive(const struct task_struct *p); static inline pid_t task_ppid_nr_ns(const struct task_struct *tsk, struct pid_namespace *ns) { @@ -2166,6 +2158,16 @@ static inline pid_t task_session_vnr(struct task_struct *tsk) return __task_pid_nr_ns(tsk, PIDTYPE_SID, NULL); } +static inline pid_t task_tgid_nr_ns(struct task_struct *tsk, struct pid_namespace *ns) +{ + return __task_pid_nr_ns(tsk, __PIDTYPE_TGID, ns); +} + +static inline pid_t task_tgid_vnr(struct task_struct *tsk) +{ + return __task_pid_nr_ns(tsk, __PIDTYPE_TGID, NULL); +} + /* obsolete, do not use */ static inline pid_t task_pgrp_nr(struct task_struct *tsk) { --- a/kernel/pid.c +++ b/kernel/pid.c @@ -526,8 +526,11 @@ pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type, if (!ns) ns = task_active_pid_ns(current); if (likely(pid_alive(task))) { - if (type != PIDTYPE_PID) + if (type != PIDTYPE_PID) { + if (type == __PIDTYPE_TGID) + type = PIDTYPE_PID; task = task->group_leader; + } nr = pid_nr_ns(rcu_dereference(task->pids[type].pid), ns); } rcu_read_unlock(); @@ -536,12 +539,6 @@ pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type, } EXPORT_SYMBOL(__task_pid_nr_ns); -pid_t task_tgid_nr_ns(struct task_struct *tsk, struct pid_namespace *ns) -{ - return pid_nr_ns(task_tgid(tsk), ns); -} -EXPORT_SYMBOL(task_tgid_nr_ns); - struct pid_namespace *task_active_pid_ns(struct task_struct *tsk) { return ns_of_pid(task_pid(tsk));