From: Kirill Tkhai <tkhai@yandex.ru>
To: Oleg Nesterov <oleg@redhat.com>, Kirill Tkhai <ktkhai@parallels.com>
Cc: linux-kernel@vger.kernel.org,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Vladimir Davydov <vdavydov@parallels.com>
Subject: Re: [PATCH RFC] sched: Revert delayed_put_task_struct() and fix use after free
Date: Thu, 16 Oct 2014 01:46:07 +0400 [thread overview]
Message-ID: <543EEB1F.3040900@yandex.ru> (raw)
In-Reply-To: <20141015194044.GA4557@redhat.com>
Yeah, you're sure about initial patch. Thanks for signal explanation.
On 15.10.2014 23:40, Oleg Nesterov wrote:
> On 10/15, Oleg Nesterov wrote:
>>
>> On 10/15, Kirill Tkhai wrote:
>>>
>>> Regarding to scheduler this may be a reason of use-after-free.
>>>
>>> task_numa_compare() schedule()
>>> rcu_read_lock() ...
>>> cur = ACCESS_ONCE(dst_rq->curr) ...
>>> ... rq->curr = next;
>>> ... context_switch()
>>> ... finish_task_switch()
>>> ... put_task_struct()
>>> ... __put_task_struct()
>>> ... free_task_struct()
>>> task_numa_assign() ...
>>> get_task_struct() ...
>>
>> Agreed. I don't understand this code (will try to take another look later),
>> but at first glance this looks wrong.
>>
>> At least the code like
>>
>> rcu_read_lock();
>> get_task_struct(foreign_rq->curr);
>> rcu_read_unlock();
>>
>> is certainly wrong. And _probably_ the problem should be fixed here. Perhaps
>> we can add try_to_get_task_struct() which does atomic_inc_not_zero() ...
>
> Yes, but perhaps in this particular case another simple fix makes more
> sense. The patch below needs a comment to explain that we check PF_EXITING
> because:
>
> 1. It doesn't make sense to migrate the exiting task. Although perhaps
> we could check ->mm == NULL instead.
>
> But let me repeat that I do not understand this code, I am not sure
> we can equally treat is_idle_task() and PF_EXITING here...
>
> 2. If PF_EXITING is not set (or ->mm != NULL) then delayed_put_task_struct()
> won't be called until we drop rcu_read_lock(), and thus get_task_struct()
> is safe.
>
Cool! Elegant fix. We set PF_EXITING in exit_signals(), which is earlier
than release_task() is called.
Shouldn't we use smp_rmb/smp_wmb here?
> And. it seems that there is another problem? Can't task_h_load(cur) race
> with itself if 2 CPU's call task_numa_migrate() and inspect the same rq
> in parallel? Again, I don't understand this code, but update_cfs_rq_h_load()
> doesn't look "atomic". In fact I am not even sure about task_h_load(env->p),
> p == current but we do not disable preemption.
>
> What do you think?
We use it completely unlocked, so nothing good is here. Also we work
with pointers.
As I understand in update_cfs_rq_h_load() we go from bottom to top,
and then from top to bottom. We set cfs_rq::h_load_next to be able
to do top-bottom passage (top is a root of "tree").
Yeah, this "way" may be overwritten by competitor. Also, task may change
its cfs_rq.
> --- x/kernel/sched/fair.c
> +++ x/kernel/sched/fair.c
> @@ -1165,7 +1165,7 @@ static void task_numa_compare(struct tas
>
> rcu_read_lock();
> cur = ACCESS_ONCE(dst_rq->curr);
> - if (cur->pid == 0) /* idle */
> + if (is_idle_task(cur) || (curr->flags & PF_EXITING))
> cur = NULL;
>
> /*
>
Looks like, we have to use the same fix for task_numa_group().
grp = rcu_dereference(tsk->numa_group);
Below we dereference grp->nr_tasks.
Also, the same in rt.c and deadline.c, but we do no take second
reference there. Wrong pointer dereference is not possible there,
not so bad.
Kirill
next prev parent reply other threads:[~2014-10-15 21:46 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-10-15 12:31 [PATCH RFC] sched: Revert delayed_put_task_struct() and fix use after free Kirill Tkhai
2014-10-15 15:06 ` Oleg Nesterov
2014-10-15 19:40 ` Oleg Nesterov
2014-10-15 21:46 ` Kirill Tkhai [this message]
2014-10-15 22:02 ` Kirill Tkhai
2014-10-16 7:59 ` Peter Zijlstra
2014-10-16 8:16 ` Kirill Tkhai
2014-10-16 9:43 ` Peter Zijlstra
2014-10-16 9:50 ` Kirill Tkhai
2014-10-16 9:51 ` Kirill Tkhai
2014-10-16 10:04 ` Kirill Tkhai
2014-10-17 21:34 ` Oleg Nesterov
2014-10-16 7:56 ` Peter Zijlstra
2014-10-16 8:01 ` Peter Zijlstra
2014-10-16 22:05 ` Oleg Nesterov
2014-10-17 21:36 ` [PATCH] sched/numa: fix unsafe get_task_struct() in task_numa_assign() Oleg Nesterov
2014-10-18 8:15 ` Kirill Tkhai
2014-10-18 8:33 ` Kirill Tkhai
2014-10-18 19:36 ` Peter Zijlstra
2014-10-18 21:18 ` Oleg Nesterov
2014-10-19 8:20 ` Kirill Tkhai
2014-10-18 20:56 ` Oleg Nesterov
2014-10-18 23:13 ` Kirill Tkhai
2014-10-19 19:24 ` Oleg Nesterov
2014-10-19 19:37 ` Oleg Nesterov
2014-10-19 19:43 ` Oleg Nesterov
2014-10-20 9:03 ` Kirill Tkhai
2014-10-20 9:13 ` Peter Zijlstra
2014-10-20 10:36 ` Kirill Tkhai
2014-10-20 9:00 ` Kirill Tkhai
2014-10-19 21:38 ` Peter Zijlstra
2014-10-20 8:56 ` Kirill Tkhai
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=543EEB1F.3040900@yandex.ru \
--to=tkhai@yandex.ru \
--cc=ktkhai@parallels.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=oleg@redhat.com \
--cc=peterz@infradead.org \
--cc=vdavydov@parallels.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.