From: Miao Xie <miaox@cn.fujitsu.com>
To: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Dmitry Adamushko <dmitry.adamushko@gmail.com>,
	Ingo Molnar <mingo@elte.hu>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Avi Kivity <avi@qumranet.com>,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [BUG] CFS vs cpu hotplug
Date: Mon, 07 Jul 2008 18:26:17 +0800
Message-ID: <4871EF49.6000501@cn.fujitsu.com>
In-Reply-To: <486B490C.3090902@cn.fujitsu.com>

on 3:59 Lai Jiangshan wrote:
> Dmitry Adamushko wrote:
>> 2008/7/2 Lai Jiangshan <laijs@cn.fujitsu.com>:
>>> Ingo Molnar wrote:
>>>> * Lai Jiangshan <laijs@cn.fujitsu.com> wrote:
>>>>
>>>>> The following oops still occurred whether this patch is applied or not.
>>>>>  [<ffffffff8059372c>] notifier_call_chain+0x33/0x5b
>>>>>  [<ffffffff802476a9>] __raw_notifier_call_chain+0x9/0xb
>>>>>  [<ffffffff802476ba>] raw_notifier_call_chain+0xf/0x11
>>>>>  [<ffffffff805736d6>] _cpu_down+0x191/0x256
>>>>>  [<ffffffff805737c1>] cpu_down+0x26/0x36
>>>>>  [<ffffffff805749c1>] store_online+0x32/0x75
>>>>>  [<ffffffff803d1982>] sysdev_store+0x24/0x26
>>>>>  [<ffffffff802d2551>] sysfs_write_file+0xe0/0x11c
>>>>>  [<ffffffff80290e6b>] vfs_write+0xae/0x137
>>>>>  [<ffffffff802913d3>] sys_write+0x47/0x70
>>>>>  [<ffffffff8020b1eb>] system_call_after_swapgs+0x7b/0x80
>>>> hm, there were multiple problems in this area and a lot of dormant bugs.
>>>> Do you have this recent upstream commit in your tree:
>>> Hi, Ingo
>>>        I tested it again with the most recent upstreams(including the
>>> following patch) committed, the oops still occurred.
>> [ taken from the oops ]
>>> kernel BUG at kernel/sched.c:6133!
>>>
[snip]
>> We should see then all tasks that have been migrated (or failed to be
>> migrated) during migration_call(CPU_DEAD, ...).
>>
> Thank you. I'll test it again with your debugging patch applied
> and get more info.

I tested it with Dmitry's patch and found that all the tasks on the offline
cpu were migrated to an online cpu by migrate_live_tasks() in migration_call().
But some tasks (such as klogd) were moved back to the offline cpu just before
the BUG_ON(rq->nr_running != 0) check, even before rq's lock was acquired.

	static int __cpuinit
	migration_call(struct notifier_block *nfb, unsigned long action, void *
	{
		...
		switch (action) {
		...
		case CPU_DEAD:
		case CPU_DEAD_FROZEN:
			cpuset_lock();
			migrate_live_tasks(cpu);
			rq = cpu_rq(cpu);
			...
			spin_lock_irq(&rq->lock);
			...
			migrate_dead_tasks(cpu);
			spin_unlock_irq(&rq->lock);
			cpuset_unlock();
			migrate_nr_uninterruptible(rq);
			BUG_ON(rq->nr_running != 0);
			...
			break;
		}
		...
	}

By debugging, I found that this bug is caused by select_task_rq_fair().
After the tasks on the offline cpu are migrated to an online cpu, the kernel
wakes these migrated tasks up soon afterwards via try_to_wake_up(), which
invokes select_task_rq_fair() to find a lower-load cpu for them in the sched
domains. But the sched domains have not been updated yet, so the offline cpu
is still part of them. select_task_rq_fair() may therefore return the offline
cpu's id, which puts the task back on the offline cpu's runqueue and triggers
the BUG_ON.

I fixed the bug by checking the return value of select_task_rq_fair() in
try_to_wake_up(): if the selected cpu is offline, the task stays on its
original cpu.
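
To make the failure and the fallback easier to follow, here is a small
user-space model of the wakeup path. It is only a sketch and not kernel code:
the arrays and every model_*() name are hypothetical stand-ins, and picking
the least-loaded cpu is an assumed simplification of what
select_task_rq_fair() does. The one thing it tries to show is that the
selection step consults a stale domain span, and that the added check falls
back to orig_cpu.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical stand-ins for kernel state (not the kernel's structures). */
static bool cpu_online_map[NR_CPUS] = { true, true, true, true };
/* CPUs that the (possibly stale) sched domain still considers usable. */
static bool domain_span[NR_CPUS]    = { true, true, true, true };
static int  cpu_load[NR_CPUS]       = { 5, 0, 7, 9 };

static bool model_cpu_is_offline(int cpu)
{
	return !cpu_online_map[cpu];
}

/* Take a cpu offline WITHOUT updating domain_span, as in the bug report. */
static void model_cpu_down(int cpu)
{
	cpu_online_map[cpu] = false;
}

/*
 * Assumed simplification of the selection step: return the least-loaded
 * cpu in the domain span. It never looks at cpu_online_map.
 */
static int model_select_task_rq(void)
{
	int best = -1;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!domain_span[cpu])
			continue;
		if (best < 0 || cpu_load[cpu] < cpu_load[best])
			best = cpu;
	}
	return best;
}

/* Model of the cpu choice in the wakeup path, with or without the check. */
static int model_wake_cpu(int orig_cpu, bool with_fix)
{
	int cpu;

	assert(orig_cpu >= 0 && orig_cpu < NR_CPUS);

	cpu = model_select_task_rq();
	if (cpu < 0)
		return orig_cpu;	/* empty span: nothing to choose from */

	/* The added check: never move a task to an offline cpu. */
	if (with_fix && model_cpu_is_offline(cpu))
		cpu = orig_cpu;

	return cpu;
}
```

In this model, after model_cpu_down(1), model_wake_cpu(0, false) still selects
cpu 1 because the span is stale, while model_wake_cpu(0, true) keeps the task
on cpu 0.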

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>

---
 kernel/sched.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 94ead43..15b5ddf 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2103,6 +2103,9 @@ static int try_to_wake_up(struct task_struct *p, unsigned int state, int sync)
 		goto out_activate;
 
 	cpu = p->sched_class->select_task_rq(p, sync);
+	if (unlikely(cpu_is_offline(cpu)))
+		cpu = orig_cpu;
+
 	if (cpu != orig_cpu) {
 		set_task_cpu(p, cpu);
 		task_rq_unlock(rq, &flags);
-- 
1.5.4.rc3


