From: Lai Jiangshan <laijs@cn.fujitsu.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>,
	Dmitry Adamushko <dmitry.adamushko@gmail.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Avi Kivity <avi@qumranet.com>,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [BUG] CFS vs cpu hotplug
Date: Tue, 01 Jul 2008 17:22:56 +0800
Message-ID: <4869F770.6050103@cn.fujitsu.com>
In-Reply-To: <20080630091711.GA26637@elte.hu>

Ingo Molnar wrote:
> * Heiko Carstens <heiko.carstens@de.ibm.com> wrote:
> 
>> On Sun, Jun 29, 2008 at 12:16:56AM +0200, Dmitry Adamushko wrote:
>>> Hello,
>>>
>>>
>>> it seems to be related to migrate_dead_tasks().
>>>
>>> Firstly I added traces to see all tasks being migrated with
>>> migrate_live_tasks() and migrate_dead_tasks(). On my setup the problem
>>> pops up (the one with "se == NULL" in the loop of
>>> pick_next_task_fair()) shortly after the traces indicate that some tasks
>>> have been migrated with migrate_dead_tasks(). Btw., I can reproduce it
>>> much faster now with just a plain cpu down/up loop.
>>>
>>> [disclaimer] Well, unless I'm really missing something important at
>>> this late hour [/disclaimer] pick_next_task() is not something
>>> appropriate for migrate_dead_tasks() :-)
>>>
>>> the following change seems to eliminate the problem on my setup
>>> (although, I kept it running only for a few minutes to get a few
>>> messages indicating migrate_dead_tasks() does move tasks and the
>>> system is still ok)
>>>
>>> [ quick hack ]
>>>
>>> @@ -5887,6 +5907,7 @@ static void migrate_dead_tasks(unsigned int dead_cpu)
>>>                 next = pick_next_task(rq, rq->curr);
>>>                 if (!next)
>>>                         break;
>>> +               next->sched_class->put_prev_task(rq, next);
>>>                 migrate_dead(dead_cpu, next);
>>>
>>>         }
>> Thanks Dmitry! With your patch I cannot reproduce the bug anymore.
> 
> thanks - it passed my testing too. It's lined up for v2.6.26 merge, in 
> tip/sched/urgent.
> 
> Avi, does this patch fix your CPU hotplug problems too?
> 
> 	Ingo

Hi, Ingo

The following oops still occurs whether or not this patch is applied.

Lai Jiangshan


------------[ cut here ]------------
kernel BUG at kernel/sched.c:6133!
invalid opcode: 0000 [1] SMP
CPU 0
Modules linked in:
Pid: 4744, comm: cpu_online.sh Not tainted 2.6.26-rc8 #1
RIP: 0010:[<ffffffff8058d0a9>]  [<ffffffff8058d0a9>] migration_call+0x3eb/0x494
RSP: 0018:ffff81007115fd28  EFLAGS: 00010202
RAX: ffffffffffffffe3 RBX: ffff810001017580 RCX: 000000801b7c6e42
RDX: ffff81007115fcf8 RSI: 0000009388d2771c RDI: ffff810001017e00
RBP: ffff81007115fd78 R08: ffff81007115e000 R09: ffff8100807d6000
R10: ffff81007fb6d050 R11: 00000000ffffffff R12: 0000000000000283
R13: ffff810001029580 R14: ffff810001029580 R15: 0000000000000002
FS:  00007fbb153d36f0(0000) GS:ffffffff807a3000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fabafe2b0a8 CR3: 0000000076901000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process cpu_online.sh (pid: 4744, threadinfo ffff81007115e000, task ffff810071447200)
Stack:  ffff81007115e000 000000007115fbd8 00000000ffffffff 0000000000000002
 ffff81007115fd78 0000000000000000 00000000ffffffff ffffffff807a1d40
 0000000000000002 0000000000000007 ffff81007115fdb8 ffffffff8059372c
Call Trace:
 [<ffffffff8059372c>] notifier_call_chain+0x33/0x5b
 [<ffffffff802476a9>] __raw_notifier_call_chain+0x9/0xb
 [<ffffffff802476ba>] raw_notifier_call_chain+0xf/0x11
 [<ffffffff805736d6>] _cpu_down+0x191/0x256
 [<ffffffff805737c1>] cpu_down+0x26/0x36
 [<ffffffff805749c1>] store_online+0x32/0x75
 [<ffffffff803d1982>] sysdev_store+0x24/0x26
 [<ffffffff802d2551>] sysfs_write_file+0xe0/0x11c
 [<ffffffff80290e6b>] vfs_write+0xae/0x137
 [<ffffffff802913d3>] sys_write+0x47/0x70
 [<ffffffff8020b1eb>] system_call_after_swapgs+0x7b/0x80


Code: 80 07 00 00 48 01 83 80 07 00 00 49 c7 85 80 07 00 00 00 00 00 00 41 fe 45 00 49 39 dd 74 02 fe 03 41 54 9d 49 83 7d 08 00 74 04 <0f> 0b eb fe 4c 89 ef e8 b8 40 00 00 eb 1e 48 8b 11 48 8b 41 08
RIP  [<ffffffff8058d0a9>] migration_call+0x3eb/0x494
 RSP <ffff81007115fd28>
---[ end trace f22fd757d4f07850 ]---

platform: x86_64 2cores*2cpus fedora9
# cat cpu_online.sh
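#!/bin/bash
# Randomly toggle cpu1..cpu3 online/offline every 2 seconds; cpu0 stays online.
# Needs bash: uses (( )) arithmetic and $RANDOM.

# Tracked state of each CPU: 1 = online, 0 = offline.
cpu1=1
cpu2=1
cpu3=1
while ((1))
do
        no=$(($RANDOM % 3 + 1))         # pick one of cpu1..cpu3 at random
        if ((!cpu$no))
        then
                # tracked as offline: bring it back online
                echo 1 > /sys/devices/system/cpu/cpu$no/online
                ((cpu$no=1))
        else
                # tracked as online: take it offline
                echo 0 > /sys/devices/system/cpu/cpu$no/online
                ((cpu$no=0))
        fi
        # print the tracked state of cpu0..cpu3 (cpu0 is never toggled)
        echo 1 $cpu1 $cpu2 $cpu3
        sleep 2
done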
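
The script above tracks each CPU's state in shell variables, so its view
can drift from the kernel's if a write to the online file fails. A minimal
sketch of a variant that reads the state back from sysfs before each toggle
follows. The names SYSFS_CPU_ROOT, cpu_is_online and toggle_cpu are
illustrative assumptions and not part of the original script; only the
/sys/devices/system/cpu/cpuN/online path is taken from it.

```sh
# Sketch only: SYSFS_CPU_ROOT, cpu_is_online and toggle_cpu are hypothetical
# names introduced for illustration. The root can be overridden for dry runs.
SYSFS_CPU_ROOT=${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}

# Exit status 0 if the online file of cpu $1 currently contains "1".
cpu_is_online() {
        [ "$(cat "$SYSFS_CPU_ROOT/cpu$1/online")" = "1" ]
}

# Request the opposite state for cpu $1 and print what was requested.
# Returns non-zero if the write to the online file fails.
toggle_cpu() {
        local no target
        no=$1
        if cpu_is_online "$no"; then
                target=0
        else
                target=1
        fi
        echo "$target" > "$SYSFS_CPU_ROOT/cpu$no/online" || return 1
        echo "cpu$no -> $target"
}
```

Calling toggle_cpu with a random CPU number from 1 to 3 inside the same
while loop, followed by sleep 2, would give the same down/up pattern as the
original script.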


