From: Lai Jiangshan <laijs@cn.fujitsu.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>,
	Dmitry Adamushko <dmitry.adamushko@gmail.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Avi Kivity <avi@qumranet.com>,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [BUG] CFS vs cpu hotplug
Date: Tue, 01 Jul 2008 17:22:56 +0800
Message-ID: <4869F770.6050103@cn.fujitsu.com>
In-Reply-To: <20080630091711.GA26637@elte.hu>

Ingo Molnar wrote:
> * Heiko Carstens <heiko.carstens@de.ibm.com> wrote:
> 
>> On Sun, Jun 29, 2008 at 12:16:56AM +0200, Dmitry Adamushko wrote:
>>> Hello,
>>>
>>>
>>> it seems to be related to migrate_dead_tasks().
>>>
>>> Firstly I added traces to see all tasks being migrated with
>>> migrate_live_tasks() and migrate_dead_tasks(). On my setup the problem
>>> pops up (the one with "se == NULL" in the loop of
>>> pick_next_task_fair()) shortly after the traces indicate that some task
>>> has been migrated with migrate_dead_tasks(). btw., I can reproduce it
>>> much faster now with just a plain cpu down/up loop.
>>>
>>> [disclaimer] Well, unless I'm really missing something important in
>>> this late hour [/disclaimer] pick_next_task() is not something
>>> appropriate for migrate_dead_tasks() :-)
>>>
>>> the following change seems to eliminate the problem on my setup
>>> (although, I kept it running only for a few minutes to get a few
>>> messages indicating migrate_dead_tasks() does move tasks and the
>>> system is still ok)
>>>
>>> [ quick hack ]
>>>
>>> @@ -5887,6 +5907,7 @@ static void migrate_dead_tasks(unsigned int dead_cpu)
>>>                 next = pick_next_task(rq, rq->curr);
>>>                 if (!next)
>>>                         break;
>>> +               next->sched_class->put_prev_task(rq, next);
>>>                 migrate_dead(dead_cpu, next);
>>>
>>>         }
>> Thanks Dmitry! With your patch I cannot reproduce the bug anymore.
> 
> thanks - it passed my testing too. It's lined up for v2.6.26 merge, in 
> tip/sched/urgent.
> 
> Avi, does this patch fix your CPU hotplug problems too?
> 
> 	Ingo

Hi, Ingo

The following oops still occurs whether or not this patch is applied.

Lai Jiangshan


------------[ cut here ]------------
kernel BUG at kernel/sched.c:6133!
invalid opcode: 0000 [1] SMP
CPU 0
Modules linked in:
Pid: 4744, comm: cpu_online.sh Not tainted 2.6.26-rc8 #1
RIP: 0010:[<ffffffff8058d0a9>]  [<ffffffff8058d0a9>] migration_call+0x3eb/0x494
RSP: 0018:ffff81007115fd28  EFLAGS: 00010202
RAX: ffffffffffffffe3 RBX: ffff810001017580 RCX: 000000801b7c6e42
RDX: ffff81007115fcf8 RSI: 0000009388d2771c RDI: ffff810001017e00
RBP: ffff81007115fd78 R08: ffff81007115e000 R09: ffff8100807d6000
R10: ffff81007fb6d050 R11: 00000000ffffffff R12: 0000000000000283
R13: ffff810001029580 R14: ffff810001029580 R15: 0000000000000002
FS:  00007fbb153d36f0(0000) GS:ffffffff807a3000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fabafe2b0a8 CR3: 0000000076901000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process cpu_online.sh (pid: 4744, threadinfo ffff81007115e000, task ffff810071447200)
Stack:  ffff81007115e000 000000007115fbd8 00000000ffffffff 0000000000000002
 ffff81007115fd78 0000000000000000 00000000ffffffff ffffffff807a1d40
 0000000000000002 0000000000000007 ffff81007115fdb8 ffffffff8059372c
Call Trace:
 [<ffffffff8059372c>] notifier_call_chain+0x33/0x5b
 [<ffffffff802476a9>] __raw_notifier_call_chain+0x9/0xb
 [<ffffffff802476ba>] raw_notifier_call_chain+0xf/0x11
 [<ffffffff805736d6>] _cpu_down+0x191/0x256
 [<ffffffff805737c1>] cpu_down+0x26/0x36
 [<ffffffff805749c1>] store_online+0x32/0x75
 [<ffffffff803d1982>] sysdev_store+0x24/0x26
 [<ffffffff802d2551>] sysfs_write_file+0xe0/0x11c
 [<ffffffff80290e6b>] vfs_write+0xae/0x137
 [<ffffffff802913d3>] sys_write+0x47/0x70
 [<ffffffff8020b1eb>] system_call_after_swapgs+0x7b/0x80


Code: 80 07 00 00 48 01 83 80 07 00 00 49 c7 85 80 07 00 00 00 00 00 00 41 fe 45 00 49 39 dd 74 02 fe 03 41 54 9d 49 83 7d 08 00 74 04 <0f> 0b eb fe 4c 89 ef e8 b8 40 00 00 eb 1e 48 8b 11 48 8b 41 08
RIP  [<ffffffff8058d0a9>] migration_call+0x3eb/0x494
 RSP <ffff81007115fd28>
---[ end trace f22fd757d4f07850 ]---

platform: x86_64 2cores*2cpus fedora9
# cat cpu_online.sh
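#!/bin/bash
# Needs bash: uses $RANDOM and (( )) arithmetic.

# Tracked state of cpu1..cpu3 (1 = online, 0 = offline); all start online.
cpu1=1
cpu2=1
cpu3=1
while ((1))
do
        # Pick one of cpu1..cpu3 at random; cpu0 is never touched.
        no=$(($RANDOM % 3 + 1))
        if ((!cpu$no))
        then
                echo 1 > /sys/devices/system/cpu/cpu$no/online
                ((cpu$no=1))
        else
                echo 0 > /sys/devices/system/cpu/cpu$no/online
                ((cpu$no=0))
        fi
        # Print the tracked state of cpu0..cpu3 (cpu0 is always online).
        echo 1 $cpu1 $cpu2 $cpu3
        sleep 2
done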
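
For comparison, the "plain cpu down/up loop" mentioned in the quoted
message could look roughly like the sketch below. It is not taken from
the thread: the function name toggle_cpu, the iteration count and the
SYSFS_CPU_ROOT override are assumptions made for illustration. The
override only exists so the loop can be dry-run against a scratch
directory instead of the real sysfs tree.

```shell
#!/bin/bash
# Hypothetical sketch: take one CPU offline and back online a fixed
# number of times. SYSFS_CPU_ROOT is an assumed override for dry runs;
# it defaults to the real sysfs location used by cpu_online.sh.
SYSFS_CPU_ROOT=${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}

# toggle_cpu CPU COUNT
toggle_cpu() {
        local cpu=$1 count=$2 i
        local ctl="$SYSFS_CPU_ROOT/cpu$cpu/online"
        for ((i = 0; i < count; i++))
        do
                echo 0 > "$ctl" || return 1     # take the CPU down
                echo 1 > "$ctl" || return 1     # bring it back up
        done
}
```

Run as root, "toggle_cpu 1 100" would cycle cpu1 one hundred times
without the random CPU selection and the 2 second sleep used above.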


