From: Kirill Tkhai <ktkhai@parallels.com>
To: Sasha Levin <sasha.levin@oracle.com>
Cc: <linux-kernel@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Oleg Nesterov <oleg@redhat.com>, Ingo Molnar <mingo@redhat.com>,
"Vladimir Davydov" <vdavydov@parallels.com>,
Kirill Tkhai <tkhai@yandex.ru>
Subject: Re: [PATCH v4] sched/numa: fix unsafe get_task_struct() in task_numa_assign()
Date: Sun, 9 Nov 2014 17:07:00 +0300 [thread overview]
Message-ID: <1415542020.474.22.camel@tkhai> (raw)
In-Reply-To: <545D928B.2070508@oracle.com>
Hi,
В Пт, 07/11/2014 в 22:48 -0500, Sasha Levin пишет:
> On 10/22/2014 03:17 AM, Kirill Tkhai wrote:
> > Unlocked access to dst_rq->curr in task_numa_compare() is racy.
> > If curr task is exiting this may be a reason of use-after-free:
> [...]
>
> I've complained about an unrelated issue in that part of the code
> a while back (https://lkml.org/lkml/2014/5/12/508) which PeterZ
> ended up fixing (https://lkml.org/lkml/2014/5/21/428) but it seems
> that both of us forgot to follow up on that and the fix never got
> upstream.
>
> Ever since this patch made it upstream, Peter's patch which I was
> carrying in my tree stopped applying and I've started seeing:
>
> [ 829.539183] BUG: spinlock recursion on CPU#10, trinity-c594/11067
> [ 829.539203] lock: 0xffff880631dd6b80, .magic: dead4ead, .owner: trinity-c594/11067, .owner_cpu: 13
> [ 829.539212] CPU: 10 PID: 11067 Comm: trinity-c594 Not tainted 3.18.0-rc3-next-20141106-sasha-00054-g09b7ccf-dirty #1448
> [ 829.539226] 0000000000000000 0000000000000000 ffff880053acb000 ffff88032b71f828
> [ 829.539235] ffffffffa009fb5a 0000000000000057 ffff880631dd6b80 ffff88032b71f868
> [ 829.539243] ffffffff963f0c57 ffff880053acbd80 ffff880053acbdb0 ffff88032b71f858
> [ 829.539246] Call Trace:
> [ 829.539265] dump_stack (lib/dump_stack.c:52)
> [ 829.539277] spin_dump (kernel/locking/spinlock_debug.c:68 (discriminator 8))
> [ 829.539282] spin_bug (kernel/locking/spinlock_debug.c:76)
> [ 829.539288] do_raw_spin_lock (kernel/locking/spinlock_debug.c:84 kernel/locking/spinlock_debug.c:135)
> [ 829.539304] ? __schedule (kernel/sched/core.c:2803)
> [ 829.539313] _raw_spin_lock_irq (include/linux/spinlock_api_smp.h:129 kernel/locking/spinlock.c:167)
> [ 829.539321] ? task_numa_find_cpu (kernel/sched/fair.c:1258 kernel/sched/fair.c:1385)
> [ 829.539330] ? rcu_is_watching (./arch/x86/include/asm/preempt.h:95 kernel/rcu/tree.c:827)
> [ 829.539336] task_numa_find_cpu (kernel/sched/fair.c:1258 kernel/sched/fair.c:1385)
> [ 829.539342] ? task_numa_find_cpu (kernel/sched/fair.c:1253 kernel/sched/fair.c:1385)
> [ 829.539352] ? preempt_count_sub (kernel/sched/core.c:2644)
> [ 829.539358] task_numa_migrate (kernel/sched/fair.c:1452)
> [ 829.539364] ? task_numa_migrate (kernel/sched/fair.c:1391)
> [ 829.539376] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:87 arch/x86/kernel/kvmclock.c:85)
> [ 829.539386] ? sched_clock (./arch/x86/include/asm/paravirt.h:192 arch/x86/kernel/tsc.c:304)
> [ 829.539392] ? sched_clock_local (kernel/sched/clock.c:202)
> [ 829.539399] numa_migrate_preferred (kernel/sched/fair.c:1539)
> [ 829.539404] ? sched_clock_local (kernel/sched/clock.c:202)
> [ 829.539411] task_numa_fault (kernel/sched/fair.c:2073)
> [ 829.539417] ? sched_clock_cpu (kernel/sched/clock.c:311)
> [ 829.539429] ? debug_smp_processor_id (lib/smp_processor_id.c:57)
> [ 829.539438] ? get_lock_stats (kernel/locking/lockdep.c:249)
> [ 829.539446] ? get_parent_ip (kernel/sched/core.c:2588)
> [ 829.539461] handle_mm_fault (mm/memory.c:3187 mm/memory.c:3233 mm/memory.c:3346 mm/memory.c:3375)
> [ 829.539466] ? find_vma (mm/mmap.c:2048)
> [ 829.539477] __do_page_fault (arch/x86/mm/fault.c:1246)
> [ 829.539485] ? context_tracking_user_exit (kernel/context_tracking.c:144)
> [ 829.539491] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
> [ 829.539498] ? trace_hardirqs_off_caller (kernel/locking/lockdep.c:2640 (discriminator 8))
> [ 829.539505] trace_do_page_fault (arch/x86/mm/fault.c:1329 include/linux/jump_label.h:114 include/linux/context_tracking_state.h:27 include/linux/context_tracking.h:45 arch/x86/mm/fault.c:1330)
> [ 829.539510] do_async_page_fault (arch/x86/kernel/kvm.c:280)
> [ 829.539516] async_page_fault (arch/x86/kernel/entry_64.S:1301)
The bellow is equal to the patch suggested by Peter.
The only thing, I'm doubt, is about the comparison of cpus instead of nids.
Should task_numa_compare() be able to be called with src_nid == dst_nid
like this may happens now?! Maybe better, we should change task_numa_migrate()
and check for env.dst_nid != env.src.nid.
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 826fdf3..c18129e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1380,6 +1380,8 @@ static void task_numa_find_cpu(struct task_numa_env *env,
/* Skip this CPU if the source task cannot migrate */
if (!cpumask_test_cpu(cpu, tsk_cpus_allowed(env->p)))
continue;
+ if (cpu == env->src_cpu)
+ continue;
env->dst_cpu = cpu;
task_numa_compare(env, taskimp, groupimp);
next prev parent reply other threads:[~2014-11-09 14:07 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-10-22 7:17 [PATCH v4] sched/numa: fix unsafe get_task_struct() in task_numa_assign() Kirill Tkhai
2014-10-22 21:30 ` introduce task_rcu_dereference? Oleg Nesterov
2014-10-22 22:23 ` Oleg Nesterov
2014-10-23 18:15 ` Oleg Nesterov
2014-10-23 8:10 ` Kirill Tkhai
2014-10-23 18:18 ` Oleg Nesterov
2014-10-24 7:51 ` Kirill Tkhai
2014-10-27 19:53 ` [PATCH 0/3] introduce task_rcu_dereference() Oleg Nesterov
2014-10-27 19:54 ` [PATCH 1/3] probe_kernel_address() can use __probe_kernel_read() Oleg Nesterov
2014-10-27 19:54 ` [PATCH 2/3] introduce probe_slab_address() Oleg Nesterov
2014-10-27 19:21 ` Christoph Lameter
2014-10-28 5:44 ` Kirill Tkhai
2014-10-28 5:48 ` Kirill Tkhai
2014-10-28 15:01 ` Peter Zijlstra
2014-10-28 17:56 ` Kirill Tkhai
2014-10-28 18:00 ` Kirill Tkhai
2014-10-28 19:55 ` Oleg Nesterov
2014-10-28 20:12 ` Oleg Nesterov
2014-10-29 5:10 ` Kirill Tkhai
2014-10-27 19:54 ` [PATCH 3/3] introduce task_rcu_dereference() Oleg Nesterov
2014-10-28 6:22 ` Kirill Tkhai
2016-05-18 17:02 ` Peter Zijlstra
2016-05-18 18:23 ` Oleg Nesterov
2016-05-18 19:10 ` Peter Zijlstra
2016-05-18 19:57 ` Oleg Nesterov
2016-05-26 11:34 ` Peter Zijlstra
2016-06-03 10:49 ` [tip:sched/core] sched/fair: Use task_rcu_dereference() tip-bot for Oleg Nesterov
2016-06-03 10:48 ` [tip:sched/core] sched/api: Introduce task_rcu_dereference() and try_get_task_struct() tip-bot for Oleg Nesterov
2014-10-28 11:02 ` [tip:sched/core] sched/numa: Fix unsafe get_task_struct() in task_numa_assign() tip-bot for Kirill Tkhai
2014-11-08 3:48 ` [PATCH v4] sched/numa: fix " Sasha Levin
2014-11-09 14:07 ` Kirill Tkhai [this message]
2014-11-10 10:03 ` Peter Zijlstra
2014-11-10 15:48 ` Sasha Levin
2014-11-10 16:01 ` Peter Zijlstra
2014-11-16 9:50 ` [tip:sched/urgent] sched/numa: Avoid selecting oneself as swap target tip-bot for Peter Zijlstra
2014-11-10 16:03 ` [PATCH v4] sched/numa: fix unsafe get_task_struct() in task_numa_assign() Peter Zijlstra
2014-11-10 16:09 ` Sasha Levin
2014-11-10 16:16 ` Peter Zijlstra
2014-11-10 16:10 ` Kirill Tkhai
2014-11-10 16:36 ` Kirill Tkhai
2014-11-10 16:44 ` Sasha Levin
2014-11-10 20:01 ` Kirill Tkhai
2014-11-12 9:49 ` Kirill Tkhai
2014-11-15 2:38 ` Sasha Levin
2014-11-18 17:30 ` Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1415542020.474.22.camel@tkhai \
--to=ktkhai@parallels.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=oleg@redhat.com \
--cc=peterz@infradead.org \
--cc=sasha.levin@oracle.com \
--cc=tkhai@yandex.ru \
--cc=vdavydov@parallels.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.