[BUG] cpu hotplug vs scheduler


* [BUG] cpu hotplug vs scheduler
@ 2008-05-13 14:33 Avi Kivity
  2008-05-13 15:33 ` Avi Kivity
  2008-05-14  8:13 ` Dmitry Adamushko
  0 siblings, 2 replies; 18+ messages in thread
From: Avi Kivity @ 2008-05-13 14:33 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar

I'm testing host cpu hotplug with kvm.  Basically running 7 guests on a 
4 core machine, offlining and onlining host cpus at random.  Eventually 
I hit this:

[4298303.496645] Booting processor 3/7 ip 6000
[4298303.506116] Initializing CPU#3
[4298303.506116] Calibrating delay using timer specific routine.. 5319.66 BogoMIPS (lpj=2659833)
[4298303.506116] CPU: L1 I cache: 32K, L1 D cache: 32K
[4298303.506116] CPU: L2 cache: 4096K
[4298303.506116] CPU: Physical Processor ID: 3
[4298303.506116] CPU: Processor Core ID: 1
[4298303.506116] x86 PAT enabled: cpu 3, old 0x7040600070406, new 0x7010600070106
[4298303.582937] CPU3: Intel(R) Xeon(R) CPU            5150  @ 2.66GHz stepping 06
[4298303.585087] checking TSC synchronization [CPU#0 -> CPU#3]: passed.
[4298303.707287] Switched to high resolution mode on CPU 3
[4298303.712943] kvm: enabling virtualization on CPU3
[4298303.713955] CPU0 attaching sched-domain:
[4298303.713901] BUG: unable to handle kernel NULL pointer dereference at 0000000000000158
[4298303.713901] IP: [<ffffffff8022e722>] pick_next_task_fair+0x55/0x7c
[4298303.713901] PGD 0
[4298303.713901] Oops: 0000 [1] PREEMPT SMP
[4298303.713901] CPU 3
[4298303.713901] Modules linked in: kvm_intel kvm netconsole autofs4 nfs lockd nfs_acl sunrpc bridge llc acpi_cpufreq backlight sg e1000 button serio_raw rtc_cmos rtc_core rtc_lib ata_piix dm_snapshot dm_mod ahci libata dock sd_mod scsi_mod [last unloaded: kvm]
[4298303.713901] Pid: 15115, comm: migration/3 Not tainted 2.6.26-rc2 #723
[4298303.713901] RIP: 0010:[<ffffffff8022e722>]  [<ffffffff8022e722>] pick_next_task_fair+0x55/0x7c
[4298303.713901] RSP: 0018:ffff81004fdfbe20  EFLAGS: 00010046
[4298303.713901] RAX: 0000000000000000 RBX: ffff81000103df80 RCX: 0000000000000000
[4298303.713901] RDX: ffff81000103e038 RSI: 000000003b9aca00 RDI: ffff81000103df00
[4298303.713901] RBP: ffff81004fdfbe40 R08: ffff81004fdfbdd0 R09: ffff81000103a0a0
[4298303.713901] R10: 0000000000000000 R11: 0000000000000003 R12: 0000000000000000
[4298303.713901] R13: ffff81000103df00 R14: ffffffff8060a140 R15: 0000000000000003
[4298303.713901] FS:  0000000000000000(0000) GS:ffff81007f806a80(0000) knlGS:0000000000000000
[4298303.713901] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[4298303.713901] CR2: 0000000000000158 CR3: 0000000000201000 CR4: 00000000000026a0
[4298303.713901] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[4298303.713901] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[4298303.713901] Process migration/3 (pid: 15115, threadinfo ffff81004fdfa000, task ffff81003f5f0700)
[4298303.713901] Stack:  ffff81004fdfbe40 ffff81000103df00 ffffffff804543c0 ffff810020985bf8
[4298303.713901]  ffff81004fdfbee0 ffffffff804373fe ffff81004fdfbe80 ffffffff8023060a
[4298303.713901]  ffff81000103df00 ffff81003f5f0700 0000000000000000 ffff81003f5f0a60
[4298303.713901] Call Trace:
[4298303.713901]  [<ffffffff804373fe>] schedule+0x414/0x6ab
[4298303.713901]  [<ffffffff8023060a>] ? hrtick_set+0x9d/0xe8
[4298303.713901]  [<ffffffff8043772f>] ? thread_return+0x9a/0xbf
[4298303.713901]  [<ffffffff80231652>] migration_thread+0x185/0x22d
[4298303.713901]  [<ffffffff802314cd>] ? migration_thread+0x0/0x22d
[4298303.713901]  [<ffffffff8024afe6>] kthread+0x49/0x77
[4298303.713901]  [<ffffffff8020d228>] child_rip+0xa/0x12
[4298303.713901]  [<ffffffff8024af9d>] ? kthread+0x0/0x77
[4298303.713901]  [<ffffffff8020d21e>] ? child_rip+0x0/0x12
[4298303.713901]
[4298303.713901]
[4298303.713901] Code: c0 74 28 48 8b 7b 58 4c 8d 60 f0 48 85 ff 74 10 4c 89 e6 e8 df cc ff ff 85 c0 75 04 4c 8b 63 58 4c 89 e6 48 89 df e8 4a e5 ff ff <49> 8b 9c 24 58 01 00 00 48 85 db 75 bf 49 83 ec 38 4c 89 ef 4c
[4298303.713901] RIP  [<ffffffff8022e722>] pick_next_task_fair+0x55/0x7c

This seems to be the assignment to cfs_rq after pick_next_entity().
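
A rough cross-check of that reading, offered as a sketch rather than a full disassembly: the bytes marked in the Code: line, <49> 8b 9c 24 58 01 00 00, look like a 64-bit mov with a 32-bit displacement off %r12. R12 is 0 in the register dump and CR2 is 0x158, so a load from offset 0x158 of a NULL pointer would fit. The helper names below are made up for illustration, and the decoder only understands this one encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode the disp32 of "mov disp32(%base), %reg" when it is encoded as
 * REX prefix, opcode 0x8b, ModRM (mod == 2, rm == 4), SIB, disp32.
 * Returns 0 and stores the displacement in *disp, or -1 if the bytes do
 * not match that encoding. Hypothetical helper, not a general x86 decoder.
 */
int decode_mov_disp32(const uint8_t *insn, size_t len, uint32_t *disp)
{
	assert(insn != NULL && disp != NULL);

	if (len < 8)
		return -1;
	if ((insn[0] & 0xf0) != 0x40)	/* REX prefix expected */
		return -1;
	if (insn[1] != 0x8b)		/* mov r/m64 -> r64 */
		return -1;
	if ((insn[2] >> 6) != 2 || (insn[2] & 7) != 4)	/* disp32 with SIB */
		return -1;

	/* insn[3] is the SIB byte; the displacement follows, little-endian. */
	*disp = (uint32_t)insn[4] |
		((uint32_t)insn[5] << 8) |
		((uint32_t)insn[6] << 16) |
		((uint32_t)insn[7] << 24);
	return 0;
}

/* Address touched by the load for a given base register value. */
uint64_t fault_address(uint64_t base, uint32_t disp)
{
	return base + disp;
}
```

Feeding it the eight bytes from the oops gives a displacement of 0x158, and adding the R12 value of 0 gives the address reported in CR2.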

I'm running kvm.git, which is currently 2.6.26-rc2 plus a few kvm 
patches.  It could be kvm's fault, but it doesn't appear so from the traces.

-- 
error compiling committee.c: too many arguments to function



* Re: [BUG] cpu hotplug vs scheduler
  2008-05-13 14:33 [BUG] cpu hotplug vs scheduler Avi Kivity
@ 2008-05-13 15:33 ` Avi Kivity
  2008-05-13 19:00   ` Heiko Carstens
  2008-05-14  8:13 ` Dmitry Adamushko
  1 sibling, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2008-05-13 15:33 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar

Avi Kivity wrote:
> I'm testing host cpu hotplug with kvm.  Basically running 7 guests on 
> a 4 core machine, offlining and onlining host cpus at random.  
> Eventually I hit this:
>

[snip oops]

> This seems to be the assignment to cfs_rq after pick_next_entity().
>
> I'm running kvm.git, which is currently 2.6.26-rc2 plus a few kvm 
> patches.  It could be kvm's fault, but it doesn't appear so from the 
> traces.
>

I get the same oops running with the kvm modules unloaded.  None of 
kvm.git's changes are in core code, so this is definitely a mainline 
problem.

It took about 40 minutes to reproduce; 1200 cpus onlined and 1200 cpus 
offlined.
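
For anyone wanting to reproduce this, a minimal sketch of such a stress loop follows. It assumes the usual sysfs control files under /sys/devices/system/cpu/cpuN/online. The function names, the caller-supplied seed and the decision to leave cpu0 alone are assumptions of this sketch, not a description of the script used for the test above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "<base>/cpu<N>/online". Returns 0 on success, -1 if it does not fit. */
int cpu_online_path(char *buf, size_t len, const char *base, int cpu)
{
	int n = snprintf(buf, len, "%s/cpu%d/online", base, cpu);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write "1" or "0" to the control file. Returns 0 on success, -1 on error. */
int set_cpu_online(const char *base, int cpu, int online)
{
	char path[256];
	FILE *f;
	int ret;

	if (cpu_online_path(path, sizeof(path), base, cpu))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ret = fputs(online ? "1" : "0", f) < 0 ? -1 : 0;
	if (fclose(f) != 0)	/* the kernel reports failure at flush time */
		ret = -1;
	return ret;
}

/*
 * Randomly online/offline CPUs 1..ncpus-1 (cpu0 is skipped, an assumption
 * of this sketch). Returns the number of writes that succeeded.
 */
int hotplug_stress(const char *base, int ncpus, int iterations, unsigned seed)
{
	int i, ok = 0;

	assert(ncpus > 1);
	srand(seed);
	for (i = 0; i < iterations; i++) {
		int cpu = 1 + rand() % (ncpus - 1);
		int online = rand() & 1;

		if (set_cpu_online(base, cpu, online) == 0)
			ok++;
	}
	return ok;
}
```

On a 4 core test box this would be called as hotplug_stress("/sys/devices/system/cpu", 4, 2400, 1) as root; writes that repeat the current state are harmless.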




* Re: [BUG] cpu hotplug vs scheduler
  2008-05-13 15:33 ` Avi Kivity
@ 2008-05-13 19:00   ` Heiko Carstens
  0 siblings, 0 replies; 18+ messages in thread
From: Heiko Carstens @ 2008-05-13 19:00 UTC (permalink / raw)
  To: Avi Kivity; +Cc: linux-kernel, Ingo Molnar

On Tue, May 13, 2008 at 06:33:00PM +0300, Avi Kivity wrote:
> Avi Kivity wrote:
>> I'm testing host cpu hotplug with kvm.  Basically running 7 guests on  
>> a 4 core machine, offlining and onlining host cpus at random.   
>> Eventually I hit this:
>>
>
> [snip oops]
>
>> This seems to be the assignment to cfs_rq after pick_next_entity().
>>
>> I'm running kvm.git, which is currently 2.6.26-rc2 plus a few kvm  
>> patches.  It could be kvm's fault, but it doesn't appear so from the  
>> traces.
>>
>
> I get the same oops running with the kvm modules unloaded.  None of  
> kvm.git's changes are in core code, so this is definitely a mainline  
> problem.
>
> It took about 40 minutes to reproduce; 1200 cpus onlined and 1200 cpus  
> offlined.

FWIW, this happens on s390 as well (without kvm support). But until now
I didn't have time to look into it. Didn't report it here yet, since we
had plenty of cpu hotplug changes in arch code, so I suspected a bug there.
This doesn't seem to be the case.

It takes only a few minutes to trigger the bug here.


* Re: [BUG] cpu hotplug vs scheduler
  2008-05-13 14:33 [BUG] cpu hotplug vs scheduler Avi Kivity
  2008-05-13 15:33 ` Avi Kivity
@ 2008-05-14  8:13 ` Dmitry Adamushko
  2008-05-14 12:30   ` Avi Kivity
  2008-05-21 14:48   ` [BUG] hotplug cpus on ia64 Cliff Wickman
  1 sibling, 2 replies; 18+ messages in thread
From: Dmitry Adamushko @ 2008-05-14  8:13 UTC (permalink / raw)
  To: Avi Kivity
  Cc: linux-kernel, Ingo Molnar, Heiko Carstens, Peter Zijlstra,
	Srivatsa Vaddagiri


Hi,

> [ ... ]
>
>  [4298303.713901] Call Trace:
>  [4298303.713901]  [<ffffffff804373fe>] schedule+0x414/0x6ab
>  [4298303.713901]  [<ffffffff8023060a>] ? hrtick_set+0x9d/0xe8
>  [4298303.713901]  [<ffffffff8043772f>] ? thread_return+0x9a/0xbf
>  [4298303.713901]  [<ffffffff80231652>] migration_thread+0x185/0x22d
>  [4298303.713901]  [<ffffffff802314cd>] ? migration_thread+0x0/0x22d
>  [4298303.713901]  [<ffffffff8024afe6>] kthread+0x49/0x77
>  [4298303.713901]  [<ffffffff8020d228>] child_rip+0xa/0x12
>  [4298303.713901]  [<ffffffff8024af9d>] ? kthread+0x0/0x77
>  [4298303.713901]  [<ffffffff8020d21e>] ? child_rip+0x0/0x12
>  [4298303.713901]
>  [4298303.713901]
>  [4298303.713901] Code: c0 74 28 48 8b 7b 58 4c 8d 60 f0 48 85 ff 74 10 4c
> 89 e6 e8 df cc ff ff 85 c0 75 04 4c 8b 63 58 4c 89 e6 48 89 df e8 4a e5 ff
> ff <49> 8b 9c 24 58 01 00 00 48 85 db 75 bf 49 83 ec 38 4c 89 ef 4c
>  [4298303.713901] RIP  [<ffffffff8022e722>] pick_next_task_fair+0x55/0x7c
>
>  This seems to be the assignment to cfs_rq after pick_next_entity().

[ cc'ed a few folks. ]


So the cfs-tree likely gets out-of-sync. I presume it won't be
reproducible with the CONFIG_GROUP_SCHED options disabled.

Anyway, would you try one of these debug-patches (not sure about the
workability of the second one though :-/)

Let's check the values of 'cfs_rq->load.weight' and 'cfs_rq->nr_running'.

thanks in advance,

(non-whitespace-damaged versions are enclosed)

---
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1291,6 +1291,12 @@ static struct task_struct *pick_next_task_fair(struct rq *rq)

        do {
                se = pick_next_entity(cfs_rq);
+
+               if (unlikely(!se))
+                       printk(KERN_ERR "BUG: se == NULL but nr_running (%ld), load (%ld),"
+                                       " rq-nr_running (%ld), rq-load (%ld)\n",
+                               cfs_rq->nr_running, cfs_rq->load.weight, rq->nr_running, rq->load.weight);
+
                cfs_rq = group_cfs_rq(se);
        } while (cfs_rq);


---

--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1280,6 +1280,8 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p)
                resched_task(curr);
 }

+static void sysrq_sched_debug_show(void);
+
 static struct task_struct *pick_next_task_fair(struct rq *rq)
 {
        struct task_struct *p;
@@ -1291,6 +1293,10 @@ static struct task_struct *pick_next_task_fair(struct rq *rq)

        do {
                se = pick_next_entity(cfs_rq);
+
+               if (unlikely(!se))
+                       sysrq_sched_debug_show();
+
                cfs_rq = group_cfs_rq(se);
        } while (cfs_rq);


---
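
To make the suspected failure mode concrete, here is a toy userspace model of the descent loop, written under the assumption that the accounting (nr_running) and the tree contents of some group runqueue disagree. All toy_* names and both structs are hypothetical simplifications, not the kernel's definitions. Unlike the debug patches above, which only report and then still evaluate group_cfs_rq(se), the model stops at the inconsistent level.

```c
#include <assert.h>
#include <stddef.h>

struct toy_cfs_rq;

struct toy_entity {
	struct toy_cfs_rq *my_q;	/* non-NULL for a group entity */
};

struct toy_cfs_rq {
	unsigned long nr_running;	/* what the accounting claims */
	struct toy_entity *leftmost;	/* what the tree really holds */
};

/* Stand-in for pick_next_entity(): NULL when the tree is empty. */
struct toy_entity *toy_pick_next_entity(struct toy_cfs_rq *cfs_rq)
{
	return cfs_rq->leftmost;
}

/*
 * Walk down the group hierarchy. Returns the task-level entity, or NULL
 * if some level is empty. *inconsistent is set to the runqueue whose
 * nr_running is non-zero although its tree is empty, else to NULL.
 */
struct toy_entity *toy_pick_next_task(struct toy_cfs_rq *cfs_rq,
				      struct toy_cfs_rq **inconsistent)
{
	struct toy_entity *se;

	assert(cfs_rq != NULL && inconsistent != NULL);
	*inconsistent = NULL;

	do {
		se = toy_pick_next_entity(cfs_rq);
		if (!se) {
			if (cfs_rq->nr_running)
				*inconsistent = cfs_rq;
			return NULL;	/* do not follow se->my_q */
		}
		cfs_rq = se->my_q;
	} while (cfs_rq);

	return se;
}
```

With a root queue holding one group entity whose own queue claims nr_running == 1 but has an empty tree, the walk returns NULL and points at the child queue, which is the kind of state the printk variant is meant to expose.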

-- 
Best regards,
Dmitry Adamushko

[-- Attachment #2: cfs_rq-debug-2.patch --]
[-- Type: text/x-patch; name=cfs_rq-debug-2.patch, Size: 630 bytes --]

diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index e24ecd3..1dcc470 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1280,6 +1280,8 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p)
 		resched_task(curr);
 }
 
+static void sysrq_sched_debug_show(void);
+
 static struct task_struct *pick_next_task_fair(struct rq *rq)
 {
 	struct task_struct *p;
@@ -1291,6 +1293,10 @@ static struct task_struct *pick_next_task_fair(struct rq *rq)
 
 	do {
 		se = pick_next_entity(cfs_rq);
+
+		if (unlikely(!se))
+			sysrq_sched_debug_show();
+
 		cfs_rq = group_cfs_rq(se);
 	} while (cfs_rq);
 

[-- Attachment #3: cfs_rq-debug-3.patch --]
[-- Type: text/x-patch; name=cfs_rq-debug-3.patch, Size: 542 bytes --]

diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index e24ecd3..e21e020 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1291,6 +1291,12 @@ static struct task_struct *pick_next_task_fair(struct rq *rq)
 
 	do {
 		se = pick_next_entity(cfs_rq);
+
+		if (unlikely(!se))
+			printk(KERN_ERR "BUG: se == NULL but nr_running (%ld), load (%ld),"
+					" rq-nr_running (%ld), rq-load (%ld)\n",
+				cfs_rq->nr_running, cfs_rq->load.weight, rq->nr_running, rq->load.weight);
+
 		cfs_rq = group_cfs_rq(se);
 	} while (cfs_rq);
 


* Re: [BUG] cpu hotplug vs scheduler
  2008-05-14  8:13 ` Dmitry Adamushko
@ 2008-05-14 12:30   ` Avi Kivity
  2008-05-14 13:05     ` Dmitry Adamushko
  2008-05-21 14:48   ` [BUG] hotplug cpus on ia64 Cliff Wickman
  1 sibling, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2008-05-14 12:30 UTC (permalink / raw)
  To: Dmitry Adamushko
  Cc: linux-kernel, Ingo Molnar, Heiko Carstens, Peter Zijlstra,
	Srivatsa Vaddagiri

Dmitry Adamushko wrote:
> Hi,
>
>   
>> [ ... ]
>>
>>  [4298303.713901] Call Trace:
>>  [4298303.713901]  [<ffffffff804373fe>] schedule+0x414/0x6ab
>>  [4298303.713901]  [<ffffffff8023060a>] ? hrtick_set+0x9d/0xe8
>>  [4298303.713901]  [<ffffffff8043772f>] ? thread_return+0x9a/0xbf
>>  [4298303.713901]  [<ffffffff80231652>] migration_thread+0x185/0x22d
>>  [4298303.713901]  [<ffffffff802314cd>] ? migration_thread+0x0/0x22d
>>  [4298303.713901]  [<ffffffff8024afe6>] kthread+0x49/0x77
>>  [4298303.713901]  [<ffffffff8020d228>] child_rip+0xa/0x12
>>  [4298303.713901]  [<ffffffff8024af9d>] ? kthread+0x0/0x77
>>  [4298303.713901]  [<ffffffff8020d21e>] ? child_rip+0x0/0x12
>>  [4298303.713901]
>>  [4298303.713901]
>>  [4298303.713901] Code: c0 74 28 48 8b 7b 58 4c 8d 60 f0 48 85 ff 74 10 4c
>> 89 e6 e8 df cc ff ff 85 c0 75 04 4c 8b 63 58 4c 89 e6 48 89 df e8 4a e5 ff
>> ff <49> 8b 9c 24 58 01 00 00 48 85 db 75 bf 49 83 ec 38 4c 89 ef 4c
>>  [4298303.713901] RIP  [<ffffffff8022e722>] pick_next_task_fair+0x55/0x7c
>>
>>  This seems to be the assignment to cfs_rq after pick_next_entity().
>>     
>
> [ cc'ed a few folks. ]
>
>
> So the cfs-tree likely gets out-of-sync. I presume it won't be
> reproducible with the CONFIG_GROUP_SCHED options disabled.
>
> Anyway, would you try one of these debug-patches (not sure about the
> workability of the second one though :-/)
>
> Let's check the values of 'cfs_rq->load.weight' and 'cfs_rq->nr_running'.
>
>   

Got this for the first patch:

[4302727.615522] Booting processor 3/7 ip 6000
[4302727.625923] Initializing CPU#3
[4302727.625923] Calibrating delay using timer specific routine.. 5319.76 BogoMIPS (lpj=2659883)
[4302727.625923] CPU: L1 I cache: 32K, L1 D cache: 32K
[4302727.625923] CPU: L2 cache: 4096K
[4302727.625923] CPU: Physical Processor ID: 3
[4302727.625923] CPU: Processor Core ID: 1
[4302727.625923] x86 PAT enabled: cpu 3, old 0x7040600070406, new 0x7010600070106
[4302727.692484] CPU3: Intel(R) Xeon(R) CPU            5150  @ 2.66GHz stepping 06
[4302727.694236] checking TSC synchronization [CPU#1 -> CPU#3]: passed.
[4302727.824185] Switched to high resolution mode on CPU 3
[4302727.859184] kvm: enabling virtualization on CPU3
[4302727.859714] Sched Debug Version: v0.07, 2.6.26-rc2 #726
[4302727.859714] now at 6918576.148656 msecs
[4302727.859714]   .sysctl_sched_latency                    : 60.000000
[4302727.859714]   .sysctl_sched_min_granularity            : 12.000000
[4302727.859714]   .sysctl_sched_wakeup_granularity         : 30.000000
[4302727.859714]   .sysctl_sched_child_runs_first           : 0.000001
[4302727.860191]   .sysctl_sched_features                   : 895
[4302727.860191]
[4302727.860191] cpu#0, 2659.999 MHz
[4302727.860191]   .nr_running                    : 2
[4302727.860191]   .load                          : 841
[4302727.860191]   .nr_switches                   : 3427530
[4302727.861205]   .nr_load_updates               : 2183358
[4302727.861205]   .nr_uninterruptible            : 15
[4302727.861205]   .jiffies                       : 4301585875
[4302727.861205]   .next_balance                  : 4301.585696
[4302727.861205]   .curr->pid                     : 4678
[4302727.861205]   .clock                         : 6918579.002757
[4302727.862216]   .cpu_load[0]                   : 841
[4302727.862216]   .cpu_load[1]                   : 841
[4302727.862216]   .cpu_load[2]                   : 841
[4302727.862216]   .cpu_load[3]                   : 841
[4302727.862216]   .cpu_load[4]                   : 841
[4302727.862216]
[4302727.862216] cfs_rq[0]:
[4302727.867209]   .exec_clock                    : 3970.569663
[4302727.867209]   .MIN_vruntime                  : 0.000001
[4302727.867209]   .min_vruntime                  : 5178969.408050
[4302727.867209]   .max_vruntime                  : 0.000001
[4302727.867209]   .spread                        : 0.000000
[4302727.867209]   .spread0                       : 0.000000
[4302727.867209]   .nr_running                    : 0
[4302727.867209]   .load                          : 0
[4302727.867209]   .bkl_count                     : 513
[4302727.867209]   .nr_spread_over                : 6
[4302727.867209]   .shares                        : 1024
[4302727.867209]
[4302727.867209] cfs_rq[0]:
[4302727.867209]   .exec_clock                    : 14.588517
[4302727.867209]   .MIN_vruntime                  : 0.000001
[4302727.867209]   .min_vruntime                  : 5178971.405628
[4302727.867209]   .max_vruntime                  : 0.000001
[4302727.867209]   .spread                        : 0.000000
[4302727.867209]   .spread0                       : 0.000000
[4302727.867209]   .nr_running                    : 0
[4302727.867209]   .load                          : 0
[4302727.867209]   .bkl_count                     : 513
[4302727.867209]   .nr_spread_over                : 0
[4302727.867209]   .shares                        : 1024
[4302727.867209]
[4302727.867209] cfs_rq[0]:
[4302727.867209]   .exec_clock                    : 41.615870
[4302727.867209]   .MIN_vruntime                  : 0.000001
[4302727.867209]   .min_vruntime                  : 5178973.403544
[4302727.867209]   .max_vruntime                  : 0.000001
[4302727.867209]   .spread                        : 0.000000
[4302727.867209]   .spread0                       : 0.000000
[4302727.867209]   .nr_running                    : 0
[4302727.867209]   .load                          : 0
[4302727.867209]   .bkl_count                     : 513
[4302727.867209]   .nr_spread_over                : 2
[4302727.867209]   .shares                        : 1024
[4302727.867209]
[4302727.867209] cfs_rq[0]:
[4302727.867209]   .exec_clock                    : 0.000000
[4302727.867209]   .MIN_vruntime                  : 0.000001
[4302727.867209]   .min_vruntime                  : 5178975.401320
[4302727.867209]   .max_vruntime                  : 0.000001
[4302727.867209]   .spread                        : 0.000000
[4302727.867209]   .spread0                       : 0.000000
[4302727.867209]   .nr_running                    : 0
[4302727.867209]   .load                          : 0
[4302727.867209]   .bkl_count                     : 513
[4302727.867209]   .nr_spread_over                : 0
[4302727.867209]   .shares                        : 0
[4302727.867209]
[4302727.867209] cfs_rq[0]:
[4302727.867209]   .exec_clock                    : 0.000001
[4302727.867209]   .MIN_vruntime                  : 0.000001
[4302727.867209]   .min_vruntime                  : 5178977.398314
[4302727.867209]   .max_vruntime                  : 0.000001
[4302727.867209]   .spread                        : 0.000000
[4302727.867209]   .spread0                       : 0.000000
[4302727.867209]   .nr_running                    : 0
[4302727.867209]   .load                          : 0
[4302727.867209]   .bkl_count                     : 513
[4302727.867209]   .nr_spread_over                : 0
[4302727.867209]   .shares                        : 0
[4302727.867209]
[4302727.867209] cfs_rq[0]:
[4302727.867209]   .exec_clock                    : 2165242.484786
[4302727.867209]   .MIN_vruntime                  : 10323214.742376
[4302727.867209]   .min_vruntime                  : 5178979.396488
[4302727.867209]   .max_vruntime                  : 10323214.742376
[4302727.867209]   .spread                        : 0.000000
[4302727.867209]   .spread0                       : 0.000000
[4302727.867209]   .nr_running                    : 2
[4302727.867209]   .load                          : 2048
[4302727.867209]   .bkl_count                     : 513
[4302727.867209]   .nr_spread_over                : 1789825
[4302727.867209]   .shares                        : 843
[4302727.867209]
[4302727.867209] runnable tasks:
[4302727.867209]             task   PID         tree-key  switches  prio     exec-runtime         sum-exec        sum-sleep
[4302727.867209] ----------------------------------------------------------------------------------------------------------
[4302727.867209] Rqemu-system-x86  4678  10323337.578253    553310   120  10323337.578255   1380505.796830     42439.250368
[4302727.867209]
[4302727.867209] cpu#1, 2659.999 MHz
[4302727.867209]   .nr_running                    : 3
[4302727.867209]   .load                          : 415
[4302727.867209]   .nr_switches                   : 629498
[4302727.867209]   .nr_load_updates               : 838874
[4302727.867209]   .nr_uninterruptible            : -6
[4302727.867209]   .jiffies                       : 4301585895
[4302727.867209]   .next_balance                  : 4301.585634
[4302727.867209]   .curr->pid                     : 7799
[4302727.867209]   .clock                         : 6918576.130865
[4302727.867209]   .cpu_load[0]                   : 415
[4302727.867209]   .cpu_load[1]                   : 415
[4302727.867209]   .cpu_load[2]                   : 415
[4302727.867209]   .cpu_load[3]                   : 415
[4302727.867209]   .cpu_load[4]                   : 415
[4302727.867209]
[4302727.867209] cfs_rq[1]:
[4302727.867209]   .exec_clock                    : 74.637431
[4302727.867209]   .MIN_vruntime                  : 0.000001
[4302727.867209]   .min_vruntime                  : 759396.868495
[4302727.867209]   .max_vruntime                  : 0.000001
[4302727.867209]   .spread                        : 0.000000
[4302727.867209]   .spread0                       : -4419588.520858
[4302727.867209]   .nr_running                    : 0
[4302727.867209]   .load                          : 0
[4302727.867209]   .bkl_count                     : 299
[4302727.867209]   .nr_spread_over                : 1
[4302727.867209]   .shares                        : 0
[4302727.867209]
[4302727.867209] cfs_rq[1]:
[4302727.867209]   .exec_clock                    : 22.707771
[4302727.867209]   .MIN_vruntime                  : 0.000001
[4302727.867209]   .min_vruntime                  : 759396.868495
[4302727.867209]   .max_vruntime                  : 0.000001
[4302727.867209]   .spread                        : 0.000000
[4302727.867209]   .spread0                       : -4419590.518446
[4302727.867209]   .nr_running                    : 0
[4302727.867209]   .load                          : 0
[4302727.867209]   .bkl_count                     : 299
[4302727.867209]   .nr_spread_over                : 1
[4302727.867209]   .shares                        : 0
[4302727.867209]
[4302727.867209] cfs_rq[1]:
[4302727.867209]   .exec_clock                    : 0.033026
[4302727.867209]   .MIN_vruntime                  : 0.000001
[4302727.867209]   .min_vruntime                  : 759396.868495
[4302727.867209]   .max_vruntime                  : 0.000001
[4302727.867209]   .spread                        : 0.000000
[4302727.867209]   .spread0                       : -4419590.518446
[4302727.867209]   .nr_running                    : 0
[4302727.867209]   .load                          : 0
[4302727.867209]   .bkl_count                     : 299
[4302727.867209]   .nr_spread_over                : 0
[4302727.867209]   .shares                        : 0
[4302727.867209]
[4302727.867209] cfs_rq[1]:
[4302727.867209]   .exec_clock                    : 0.000000
[4302727.867209]   .MIN_vruntime                  : 0.000001
[4302727.867209]   .min_vruntime                  : 759396.868495
[4302727.867209]   .max_vruntime                  : 0.000001
[4302727.867209]   .spread                        : 0.000000
[4302727.867209]   .spread0                       : -4419590.518446
[4302727.867209]   .nr_running                    : 0
[4302727.867209]   .load                          : 0
[4302727.867209]   .bkl_count                     : 299
[4302727.867209]   .nr_spread_over                : 0
[4302727.867209]   .shares                        : 0
[4302727.867209]
[4302727.867209] cfs_rq[1]:
[4302727.867209]   .exec_clock                    : 0.026450
[4302727.867209]   .MIN_vruntime                  : 0.000001
[4302727.867209]   .min_vruntime                  : 759396.868495
[4302727.867209]   .max_vruntime                  : 0.000001
[4302727.867209]   .spread                        : 0.000000
[4302727.867209]   .spread0                       : -4419590.518446
[4302727.867209]   .nr_running                    : 0
[4302727.867209]   .load                          : 0
[4302727.867209]   .bkl_count                     : 299
[4302727.867209]   .nr_spread_over                : 0
[4302727.867209]   .shares                        : 0
[4302727.867209]
[4302727.867210] cfs_rq[1]:
[4302727.867210]   .exec_clock                    : 754981.092689
[4302727.867210]   .MIN_vruntime                  : 1239813.449102
[4302727.867210]   .min_vruntime                  : 759396.868495
[4302727.867210]   .max_vruntime                  : 1239819.334711
[4302727.867210]   .spread                        : 5.885609
[4302727.867210]   .spread0                       : -4419590.518446
[4302727.867210]   .nr_running                    : 3
[4302727.867210]   .load                          : 3072
[4302727.867210]   .bkl_count                     : 299
[4302727.867210]   .nr_spread_over                : 53817
[4302727.867210]   .shares                        : 415
[4302727.867210]
[4302727.867210] runnable tasks:
[4302727.867210]             task   PID         tree-key  switches  prio     exec-runtime         sum-exec        sum-sleep
[4302727.867210] ----------------------------------------------------------------------------------------------------------
[4302727.900184]  qemu-system-x86  4987   1239813.449102    709828   120   1239813.449102   1410504.949783     22865.206608
[4302727.900184]  qemu-system-x86  5052   1239819.334711    530481   120   1239819.334711   1365146.519564     50937.064744
[4302727.900184] Rtoggle-processo  7799   1239811.208673     47886   120   1239811.208673     57552.631854   1592798.974913
[4302727.900184]
[4302727.900184] cpu#3, 2659.999 MHz
[4302727.900184]   .nr_running                    : 1
[4302727.900184]   .load                          : 285
[4302727.900184]   .nr_switches                   : 611209
[4302727.900184]   .nr_load_updates               : 843051
[4302727.900184]   .nr_uninterruptible            : -2
[4302727.900184]   .jiffies                       : 4301585916
[4302727.900184]   .next_balance                  : 4301.586873
[4302727.900184]   .curr->pid                     : 0
[4302727.900184]   .clock                         : 6918576.376068
[4302727.900184]   .cpu_load[0]                   : 0
[4302727.900184]   .cpu_load[1]                   : 0
[4302727.900184]   .cpu_load[2]                   : 0
[4302727.900184]   .cpu_load[3]                   : 181
[4302727.900184]   .cpu_load[4]                   : 1108
[4302727.900184]
[4302727.900184] cfs_rq[3]:
[4302727.900184]   .exec_clock                    : 8.224765
[4302727.900184] BUG: spinlock recursion on CPU#3, swapper/0
[4302727.900184]  lock: ffff81000103df00, .magic: dead4ead, .owner: swapper/0, .owner_cpu: 3
[4302727.900184] Pid: 0, comm: swapper Not tainted 2.6.26-rc2 #726
[4302727.900184]
[4302727.900184] Call Trace:
[4302727.900184]  [<ffffffff803249de>] spin_bug+0x9e/0xe9
[4302727.900184]  [<ffffffff80324af4>] _raw_spin_lock+0x41/0x123
[4302727.900184]  [<ffffffff80439638>] _spin_lock_irqsave+0x2f/0x37
[4302727.900184]  [<ffffffff8022ef7c>] print_cfs_rq+0xca/0x46a
[4302727.900184]  [<ffffffff80231f97>] sched_debug_show+0x7a3/0xb8c
[4302727.900184]  [<ffffffff8023238d>] sysrq_sched_debug_show+0xd/0xf
[4302727.900184]  [<ffffffff802323ee>] pick_next_task_fair+0x5f/0x86
[4302727.900184]  [<ffffffff804373f6>] schedule+0x3fc/0x6ab
[4302727.900184]  [<ffffffff8024e03f>] ? ktime_get_ts+0x49/0x4e
[4302727.900184]  [<ffffffff80253a28>] ? tick_nohz_stop_idle+0x2d/0x54
[4302727.900184]  [<ffffffff8021283f>] ? mwait_idle+0x0/0x59
[4302727.900184]  [<ffffffff8020ae37>] cpu_idle+0xc8/0xd7
[4302727.900184]  [<ffffffff804332f1>] start_secondary+0x173/0x178
[4302727.900184]
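
The trace suggests why the dump stopped here: the debug call runs from inside schedule(), where the runqueue lock is presumably already held, and print_cfs_rq() then takes the same lock again. A single-threaded toy model of that sequence, with an owner-tracking lock that reports recursion instead of spinning, might look like the following. The toy_* names are invented for this sketch and the lock is not the kernel's spinlock.

```c
#include <assert.h>

#define TOY_NO_OWNER (-1)

struct toy_lock {
	int owner_cpu;	/* TOY_NO_OWNER while unlocked */
};

/* Returns 0 on success, -1 if this CPU already holds the lock. */
int toy_lock_acquire(struct toy_lock *l, int cpu)
{
	if (l->owner_cpu == cpu)
		return -1;	/* recursion: a real spinlock would deadlock */
	assert(l->owner_cpu == TOY_NO_OWNER);	/* single-threaded model */
	l->owner_cpu = cpu;
	return 0;
}

void toy_lock_release(struct toy_lock *l, int cpu)
{
	assert(l->owner_cpu == cpu);
	l->owner_cpu = TOY_NO_OWNER;
}

/* Models print_cfs_rq(): takes the runqueue lock unconditionally. */
int toy_print_cfs_rq(struct toy_lock *rq_lock, int cpu)
{
	if (toy_lock_acquire(rq_lock, cpu))
		return -1;
	/* ... read the tree under the lock ... */
	toy_lock_release(rq_lock, cpu);
	return 0;
}

/* Models schedule() calling the debug dump with the lock already held. */
int toy_schedule_with_debug(struct toy_lock *rq_lock, int cpu)
{
	int ret;

	if (toy_lock_acquire(rq_lock, cpu))
		return -1;
	ret = toy_print_cfs_rq(rq_lock, cpu);	/* nested acquire fails */
	toy_lock_release(rq_lock, cpu);
	return ret;
}
```

Called on its own, toy_print_cfs_rq() succeeds; called through toy_schedule_with_debug() on the same CPU it fails, which mirrors the .owner_cpu: 3 report on CPU#3 above.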







* Re: [BUG] cpu hotplug vs scheduler
  2008-05-14 12:30   ` Avi Kivity
@ 2008-05-14 13:05     ` Dmitry Adamushko
  2008-05-15 10:19       ` Avi Kivity
  0 siblings, 1 reply; 18+ messages in thread
From: Dmitry Adamushko @ 2008-05-14 13:05 UTC (permalink / raw)
  To: Avi Kivity
  Cc: linux-kernel, Ingo Molnar, Heiko Carstens, Peter Zijlstra,
	Srivatsa Vaddagiri


2008/5/14 Avi Kivity <avi@qumranet.com>:

> [ ... ]
>
>  [4302727.900184] Call Trace:
>  [4302727.900184]  [<ffffffff803249de>] spin_bug+0x9e/0xe9
>  [4302727.900184]  [<ffffffff80324af4>] _raw_spin_lock+0x41/0x123
>  [4302727.900184]  [<ffffffff80439638>] _spin_lock_irqsave+0x2f/0x37
>  [4302727.900184]  [<ffffffff8022ef7c>] print_cfs_rq+0xca/0x46a
>  [4302727.900184]  [<ffffffff80231f97>] sched_debug_show+0x7a3/0xb8c
>  [4302727.900184]  [<ffffffff8023238d>] sysrq_sched_debug_show+0xd/0xf
>  [4302727.900184]  [<ffffffff802323ee>] pick_next_task_fair+0x5f/0x86


Err... sorry for the broken patch. The patch below on top of the
previous one should address this issue (ugly, but should be ok for
debugging).
'tasklist_lock' shouldn't cause a double lock, I guess.

Sorry for rather 'blind' attempts. If not, then I'll
prepare/test/take-a-closer-look at it later today when I'm at home.

TIA,


--- kernel/sched_debug-prev.c   2008-05-14 14:53:28.000000000 +0200
+++ kernel/sched_debug.c        2008-05-14 14:58:12.000000000 +0200
@@ -125,6 +125,7 @@ void print_cfs_rq(struct seq_file *m, in
        char path[128] = "";
        struct cgroup *cgroup = NULL;
        struct task_group *tg = cfs_rq->tg;
+       int was_locked;

        if (tg)
                cgroup = tg->css.cgroup;
@@ -138,7 +139,11 @@ void print_cfs_rq(struct seq_file *m, in
        SEQ_printf(m, "  .%-30s: %Ld.%06ld\n", "exec_clock",
                        SPLIT_NS(cfs_rq->exec_clock));

-       spin_lock_irqsave(&rq->lock, flags);
+       was_locked = spin_is_locked(&rq->lock);
+
+       if (!was_locked)
+               spin_lock_irqsave(&rq->lock, flags);
+
        if (cfs_rq->rb_leftmost)
                MIN_vruntime = (__pick_next_entity(cfs_rq))->vruntime;
        last = __pick_last_entity(cfs_rq);
@@ -146,7 +151,10 @@ void print_cfs_rq(struct seq_file *m, in
                max_vruntime = last->vruntime;
        min_vruntime = rq->cfs.min_vruntime;
        rq0_min_vruntime = per_cpu(runqueues, 0).cfs.min_vruntime;
-       spin_unlock_irqrestore(&rq->lock, flags);
+
+       if (!was_locked)
+               spin_unlock_irqrestore(&rq->lock, flags);
+
        SEQ_printf(m, "  .%-30s: %Ld.%06ld\n", "MIN_vruntime",
                        SPLIT_NS(MIN_vruntime));
        SEQ_printf(m, "  .%-30s: %Ld.%06ld\n", "min_vruntime",

---
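
One caveat with the was_locked pattern, sketched below with a toy lock: a plain "is it locked?" test cannot tell whether the current CPU or some other CPU is the holder. In the second case the tree is read without protection, which is tolerable for a debug hack but worth keeping in mind when reading the output. Every toy_* name is hypothetical, and the model is single-threaded and only tracks state.

```c
#include <assert.h>

struct toy_rq_lock {
	int locked;
	int owner_cpu;	/* meaningful only while locked */
};

/* Analogue of an "is anyone holding it?" test. */
int toy_is_locked(const struct toy_rq_lock *l)
{
	return l->locked;
}

/* Stricter question: does this CPU hold it? */
int toy_held_by(const struct toy_rq_lock *l, int cpu)
{
	return l->locked && l->owner_cpu == cpu;
}

/*
 * Follows the was_locked pattern from the patch. Returns 1 if the read
 * in the middle ran with the lock held by `cpu`, 0 if it ran while some
 * other CPU held the lock.
 */
int toy_read_with_was_locked(struct toy_rq_lock *l, int cpu)
{
	int was_locked = toy_is_locked(l);
	int protected_read;

	if (!was_locked) {
		l->locked = 1;
		l->owner_cpu = cpu;
	}

	/* the rb-tree reads of print_cfs_rq() would sit here */
	protected_read = toy_held_by(l, cpu);

	if (!was_locked) {
		assert(toy_held_by(l, cpu));
		l->locked = 0;
	}
	return protected_read;
}
```

The function returns 1 when the lock was free or already held by the calling CPU, and 0 when a different CPU held it, the case the simple test cannot distinguish.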


-- 
Best regards,
Dmitry Adamushko

[-- Attachment #2: is_locked.diff --]
[-- Type: text/x-patch; name=is_locked.diff, Size: 1211 bytes --]

--- kernel/sched_debug-prev.c	2008-05-14 14:53:28.000000000 +0200
+++ kernel/sched_debug.c	2008-05-14 14:58:12.000000000 +0200
@@ -125,6 +125,7 @@ void print_cfs_rq(struct seq_file *m, in
 	char path[128] = "";
 	struct cgroup *cgroup = NULL;
 	struct task_group *tg = cfs_rq->tg;
+	int was_locked;
 
 	if (tg)
 		cgroup = tg->css.cgroup;
@@ -138,7 +139,11 @@ void print_cfs_rq(struct seq_file *m, in
 	SEQ_printf(m, "  .%-30s: %Ld.%06ld\n", "exec_clock",
 			SPLIT_NS(cfs_rq->exec_clock));
 
-	spin_lock_irqsave(&rq->lock, flags);
+	was_locked = spin_is_locked(&rq->lock);
+
+	if (!was_locked)
+		spin_lock_irqsave(&rq->lock, flags);
+	
 	if (cfs_rq->rb_leftmost)
 		MIN_vruntime = (__pick_next_entity(cfs_rq))->vruntime;
 	last = __pick_last_entity(cfs_rq);
@@ -146,7 +151,10 @@ void print_cfs_rq(struct seq_file *m, in
 		max_vruntime = last->vruntime;
 	min_vruntime = rq->cfs.min_vruntime;
 	rq0_min_vruntime = per_cpu(runqueues, 0).cfs.min_vruntime;
-	spin_unlock_irqrestore(&rq->lock, flags);
+
+	if (!was_locked)
+		spin_unlock_irqrestore(&rq->lock, flags);
+
 	SEQ_printf(m, "  .%-30s: %Ld.%06ld\n", "MIN_vruntime",
 			SPLIT_NS(MIN_vruntime));
 	SEQ_printf(m, "  .%-30s: %Ld.%06ld\n", "min_vruntime",

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [BUG] cpu hotplug vs scheduler
  2008-05-14 13:05     ` Dmitry Adamushko
@ 2008-05-15 10:19       ` Avi Kivity
  2008-05-21 12:31         ` Heiko Carstens
  0 siblings, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2008-05-15 10:19 UTC (permalink / raw)
  To: Dmitry Adamushko
  Cc: linux-kernel, Ingo Molnar, Heiko Carstens, Peter Zijlstra,
	Srivatsa Vaddagiri

Dmitry Adamushko wrote:
> 2008/5/14 Avi Kivity <avi@qumranet.com>:
>
>   
>> [ ... ]
>>
>>  [4302727.900184] Call Trace:
>>  [4302727.900184]  [<ffffffff803249de>] spin_bug+0x9e/0xe9
>>  [4302727.900184]  [<ffffffff80324af4>] _raw_spin_lock+0x41/0x123
>>  [4302727.900184]  [<ffffffff80439638>] _spin_lock_irqsave+0x2f/0x37
>>  [4302727.900184]  [<ffffffff8022ef7c>] print_cfs_rq+0xca/0x46a
>>  [4302727.900184]  [<ffffffff80231f97>] sched_debug_show+0x7a3/0xb8c
>>  [4302727.900184]  [<ffffffff8023238d>] sysrq_sched_debug_show+0xd/0xf
>>  [4302727.900184]  [<ffffffff802323ee>] pick_next_task_fair+0x5f/0x86
>>     
>
>
> Err... sorry for the broken patch. The patch below on top of the
> previous one should address this issue (ugly, but should be ok for
> debugging).
> 'tasklist_lock' shouldn't cause a double lock, I guess.
>
> Sorry for the rather 'blind' attempts. If not, then I'll
> prepare/test/take-a-closer-look at it later today when I'm at home.
>
>   

Now it died when allocating the pda:

[4301986.526627] Booting processor 1/6 ip 6000
[4301986.537195] swapper: page allocation failure. order:2, mode:0x20
[4301986.537195] Pid: 0, comm: swapper Not tainted 2.6.26-rc2 #728
[4301986.537195]
[4301986.537195] Call Trace:
[4301986.537195]  [<ffffffff802388ae>] ? __printk_ratelimit+0x9/0xb
[4301986.537195]  [<ffffffff80272e07>] __alloc_pages_internal+0x387/0x39f
[4301986.537195]  [<ffffffff80272e35>] __alloc_pages+0xb/0xd
[4301986.537195]  [<ffffffff80272e4c>] __get_free_pages+0x15/0x46
[4301986.537195]  [<ffffffff80210cdf>] pda_init+0x87/0xac
[4301986.537195]  [<ffffffff8042fed1>] cpu_init+0x5e/0x34a
[4301986.537195]  [<ffffffff8026b437>] ? rcu_needs_cpu+0x36/0x47
[4301986.537195]  [<ffffffff802311a7>] ? idle_task_exit+0x71/0xad
[4301986.537195]  [<ffffffff804331aa>] start_secondary+0xc/0x178
[4301986.537195]
[4301986.537195] Mem-info:
[4301986.537195] DMA per-cpu:
[4301986.537195] CPU    0: hi:    0, btch:   1 usd:   0
[4301986.537195] CPU    2: hi:    0, btch:   1 usd:   0
[4301986.537195] DMA32 per-cpu:
[4301986.537195] CPU    0: hi:  186, btch:  31 usd: 153
[4301986.537195] CPU    2: hi:  186, btch:  31 usd: 135
[4301986.537195] Active:347879 inactive:109278 dirty:107 writeback:0 
unstable:0
[4301986.537195]  free:5311 slab:10186 mapped:30496 pagetables:1587 bounce:0
[4301986.537195] DMA free:8012kB min:24kB low:28kB high:36kB active:60kB 
inactive:0kB present:8644kB pages_scanned:0 all_unreclaimable? no
[4301986.537195] lowmem_reserve[]: 0 1999 1999 1999
[4301986.537195] DMA32 free:13232kB min:5704kB low:7128kB high:8556kB 
active:1391456kB inactive:437112kB present:2047248kB pages_scanned:0 
all_unreclaimable? no
[4301986.537195] lowmem_reserve[]: 0 0 0 0
[4301986.537195] DMA: 19*4kB 16*8kB 10*16kB 11*32kB 6*64kB 4*128kB 
1*256kB 2*512kB 3*1024kB 1*2048kB 0*4096kB = 8012kB
[4301986.537195] DMA32: 2034*4kB 570*8kB 1*16kB 0*32kB 0*64kB 0*128kB 
0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 13224kB
[4301986.537195] 212750 total pagecache pages
[4301986.537195] Swap cache: add 73, delete 18, find 0/0
[4301986.537195] Free swap  = 3210964kB
[4301986.537195] Total swap = 3211256kB

There seem to be some order-2 allocations available, though.  And in any 
case, cpu hotplug shouldn't crash on such failures, only fail.
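
The per-zone buddy lists in the log above can be checked by hand. As a rough
sketch (summarize_buddy_line() and struct buddy_summary are hypothetical names,
and 4kB pages are assumed), the helper below sums one "count*sizekB" list and
counts the free blocks of at least a given order:

```c
#include <stdlib.h>

/* Totals for one buddy-allocator line from the Mem-info dump. */
struct buddy_summary {
	long total_kb;	/* sum of count * size over all entries */
	long blocks;	/* free blocks of at least the requested order */
};

/*
 * Hypothetical helper: parse "2034*4kB 570*8kB ... = 13224kB" and stop at
 * the first token that is not of the form <count>*<size>kB (here the "=").
 * Assumes 4kB pages, so an order-N block is (4 << N) kB.
 */
struct buddy_summary summarize_buddy_line(const char *line, unsigned int order)
{
	const long min_kb = 4L << order;
	struct buddy_summary s = { 0, 0 };
	const char *p = line;

	for (;;) {
		char *end;
		long count, size_kb;

		count = strtol(p, &end, 10);	/* skips leading blanks */
		if (end == p || *end != '*')
			break;
		size_kb = strtol(end + 1, &end, 10);
		if (size_kb <= 0)
			break;

		s.total_kb += count * size_kb;
		if (size_kb >= min_kb)
			s.blocks += count;

		p = end;
		while (*p == 'k' || *p == 'B')	/* step over the unit */
			p++;
	}
	return s;
}
```

Fed the two lines from the dump with order 2, this gives 38 blocks of 16kB or
larger in DMA (8012kB total) and 2 in DMA32 (13224kB total). Whether the
allocator would actually hand any of them out for this request depends on the
zone watermark and lowmem_reserve checks, which the sketch does not model.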

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [BUG] cpu hotplug vs scheduler
  2008-05-15 10:19       ` Avi Kivity
@ 2008-05-21 12:31         ` Heiko Carstens
  2008-05-21 12:42           ` Avi Kivity
  0 siblings, 1 reply; 18+ messages in thread
From: Heiko Carstens @ 2008-05-21 12:31 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Dmitry Adamushko, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Srivatsa Vaddagiri

> Now it died when allocating the pda:
>
> [4301986.526627] Booting processor 1/6 ip 6000
> [4301986.537195] swapper: page allocation failure. order:2, mode:0x20
> [4301986.537195] Pid: 0, comm: swapper Not tainted 2.6.26-rc2 #728
> [4301986.537195]
> [4301986.537195] Call Trace:
> [4301986.537195]  [<ffffffff802388ae>] ? __printk_ratelimit+0x9/0xb
> [4301986.537195]  [<ffffffff80272e07>] __alloc_pages_internal+0x387/0x39f
> [4301986.537195]  [<ffffffff80272e35>] __alloc_pages+0xb/0xd
> [4301986.537195]  [<ffffffff80272e4c>] __get_free_pages+0x15/0x46
> [4301986.537195]  [<ffffffff80210cdf>] pda_init+0x87/0xac
> [4301986.537195]  [<ffffffff8042fed1>] cpu_init+0x5e/0x34a
> [4301986.537195]  [<ffffffff8026b437>] ? rcu_needs_cpu+0x36/0x47
> [4301986.537195]  [<ffffffff802311a7>] ? idle_task_exit+0x71/0xad
> [4301986.537195]  [<ffffffff804331aa>] start_secondary+0xc/0x178
> [4301986.537195]
> [4301986.537195] Mem-info:
> [4301986.537195] DMA per-cpu:
> [4301986.537195] CPU    0: hi:    0, btch:   1 usd:   0
> [4301986.537195] CPU    2: hi:    0, btch:   1 usd:   0
> [4301986.537195] DMA32 per-cpu:
> [4301986.537195] CPU    0: hi:  186, btch:  31 usd: 153
> [4301986.537195] CPU    2: hi:  186, btch:  31 usd: 135
> [4301986.537195] Active:347879 inactive:109278 dirty:107 writeback:0  
> unstable:0
> [4301986.537195]  free:5311 slab:10186 mapped:30496 pagetables:1587 bounce:0
> [4301986.537195] DMA free:8012kB min:24kB low:28kB high:36kB active:60kB  
> inactive:0kB present:8644kB pages_scanned:0 all_unreclaimable? no
> [4301986.537195] lowmem_reserve[]: 0 1999 1999 1999
> [4301986.537195] DMA32 free:13232kB min:5704kB low:7128kB high:8556kB  
> active:1391456kB inactive:437112kB present:2047248kB pages_scanned:0  
> all_unreclaimable? no
> [4301986.537195] lowmem_reserve[]: 0 0 0 0
> [4301986.537195] DMA: 19*4kB 16*8kB 10*16kB 11*32kB 6*64kB 4*128kB  
> 1*256kB 2*512kB 3*1024kB 1*2048kB 0*4096kB = 8012kB
> [4301986.537195] DMA32: 2034*4kB 570*8kB 1*16kB 0*32kB 0*64kB 0*128kB  
> 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 13224kB
> [4301986.537195] 212750 total pagecache pages
> [4301986.537195] Swap cache: add 73, delete 18, find 0/0
> [4301986.537195] Free swap  = 3210964kB
> [4301986.537195] Total swap = 3211256kB
>
> There seem to be some order-2 allocations available, though.  And in any  
> case, cpu hotplug shouldn't crash on such failures, only fail.

Hi Avi,

I think you fixed this specific bug. Does cpu hotplug stresstest now work
for you?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [BUG] cpu hotplug vs scheduler
  2008-05-21 12:31         ` Heiko Carstens
@ 2008-05-21 12:42           ` Avi Kivity
  2008-05-21 12:55             ` Heiko Carstens
  0 siblings, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2008-05-21 12:42 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Dmitry Adamushko, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Srivatsa Vaddagiri

Heiko Carstens wrote:
>> Now it died when allocating the pda:
>>
>> [4301986.526627] Booting processor 1/6 ip 6000
>> [4301986.537195] swapper: page allocation failure. order:2, mode:0x20
>> [4301986.537195] Pid: 0, comm: swapper Not tainted 2.6.26-rc2 #728
>> [4301986.537195]
>> [4301986.537195] Call Trace:
>> [4301986.537195]  [<ffffffff802388ae>] ? __printk_ratelimit+0x9/0xb
>> [4301986.537195]  [<ffffffff80272e07>] __alloc_pages_internal+0x387/0x39f
>> [4301986.537195]  [<ffffffff80272e35>] __alloc_pages+0xb/0xd
>> [4301986.537195]  [<ffffffff80272e4c>] __get_free_pages+0x15/0x46
>> [4301986.537195]  [<ffffffff80210cdf>] pda_init+0x87/0xac
>> [4301986.537195]  [<ffffffff8042fed1>] cpu_init+0x5e/0x34a
>> [4301986.537195]  [<ffffffff8026b437>] ? rcu_needs_cpu+0x36/0x47
>> [4301986.537195]  [<ffffffff802311a7>] ? idle_task_exit+0x71/0xad
>> [4301986.537195]  [<ffffffff804331aa>] start_secondary+0xc/0x178
>> [4301986.537195]
>> [4301986.537195] Mem-info:
>> [4301986.537195] DMA per-cpu:
>> [4301986.537195] CPU    0: hi:    0, btch:   1 usd:   0
>> [4301986.537195] CPU    2: hi:    0, btch:   1 usd:   0
>> [4301986.537195] DMA32 per-cpu:
>> [4301986.537195] CPU    0: hi:  186, btch:  31 usd: 153
>> [4301986.537195] CPU    2: hi:  186, btch:  31 usd: 135
>> [4301986.537195] Active:347879 inactive:109278 dirty:107 writeback:0  
>> unstable:0
>> [4301986.537195]  free:5311 slab:10186 mapped:30496 pagetables:1587 bounce:0
>> [4301986.537195] DMA free:8012kB min:24kB low:28kB high:36kB active:60kB  
>> inactive:0kB present:8644kB pages_scanned:0 all_unreclaimable? no
>> [4301986.537195] lowmem_reserve[]: 0 1999 1999 1999
>> [4301986.537195] DMA32 free:13232kB min:5704kB low:7128kB high:8556kB  
>> active:1391456kB inactive:437112kB present:2047248kB pages_scanned:0  
>> all_unreclaimable? no
>> [4301986.537195] lowmem_reserve[]: 0 0 0 0
>> [4301986.537195] DMA: 19*4kB 16*8kB 10*16kB 11*32kB 6*64kB 4*128kB  
>> 1*256kB 2*512kB 3*1024kB 1*2048kB 0*4096kB = 8012kB
>> [4301986.537195] DMA32: 2034*4kB 570*8kB 1*16kB 0*32kB 0*64kB 0*128kB  
>> 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 13224kB
>> [4301986.537195] 212750 total pagecache pages
>> [4301986.537195] Swap cache: add 73, delete 18, find 0/0
>> [4301986.537195] Free swap  = 3210964kB
>> [4301986.537195] Total swap = 3211256kB
>>
>> There seem to be some order-2 allocations available, though.  And in any  
>> case, cpu hotplug shouldn't crash on such failures, only fail.
>>     
>
> Hi Avi,
>
> I think you fixed this specific bug. Does cpu hotplug stresstest now work
> for you?
>   

Which bug, the order-2 allocs?  I certainly didn't fix it.

Last I tried, cpu hotplug still failed, but it was some time ago as 
patches flow.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [BUG] cpu hotplug vs scheduler
  2008-05-21 12:42           ` Avi Kivity
@ 2008-05-21 12:55             ` Heiko Carstens
  2008-05-21 13:03               ` Avi Kivity
  0 siblings, 1 reply; 18+ messages in thread
From: Heiko Carstens @ 2008-05-21 12:55 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Dmitry Adamushko, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Srivatsa Vaddagiri

On Wed, May 21, 2008 at 03:42:51PM +0300, Avi Kivity wrote:
> Heiko Carstens wrote:
>> I think you fixed this specific bug. Does cpu hotplug stresstest now work
>> for you?
>
> Which bug, the order-2 allocs?  I certainly didn't fix it.
>
> Last I tried, cpu hotplug still failed, but it was some time ago as  
> patches flow.

Ah no. I thought "x86: fix crash on cpu hotplug on pat-incapable machines"
would have fixed the cpu hotplug bug for you.

I was actually talking of the pick_next_task_fair incarnation of one of
the cpu hotplug bugs.

s390 backtrace on 2.6.25 looks like this.

    <1>Unable to handle kernel pointer dereference at virtual kernel address 00000200000db000
    <4>Oops: 003b [#1] PREEMPT SMP 
    <4>Modules linked in: dm_multipath sunrpc qeth_l2 dm_mod cu3088 qeth ccwgroup
    <4>CPU: 0 Not tainted 2.6.25-27.x.20080422-s390xdefault #1
    <4>Process migration/0 (pid: 5806, task: 00000000aac3a838, ksp: 000000017a2f3dd0)
    <4>Krnl PSW : 0400200180000000 0000000000125f48 (pick_next_task_fair+0x34/0xd4)
    <4>           R:0 T:1 IO:0 EX:0 Key:0 M:0 W:0 P:0 AS:0 CC:2 PM:0 EA:3
    <4>Krnl GPRS: 0000079e029c6ecb 0000000000000000 000000008548ea60 000000008548d700
    <4>           00000000004526f8 0000000000000597 0000000000000000 0000000000000000
    <4>           0000000000672a00 0000000000000000 00000000aac3a838 000000017a2f3d10
    <4>           00000200000db85a 00000000004672a0 000000017a2f3d50 000000017a2f3d10
    <4>Krnl Code: 0000000000125f38: e3e0f0980024        stg     %r14,152(%r15)
    <4>           0000000000125f3e: d507d000c010        clc     0(8,%r13),16(%r12)
    <4>           0000000000125f44: a784004e            brc     8,125fe0
    <4>          >0000000000125f48: d507d000c030        clc     0(8,%r13),48(%r12)
    <4>           0000000000125f4e: b904002c            lgr     %r2,%r12
    <4>           0000000000125f52: a7990000            lghi    %r9,0
    <4>           0000000000125f56: a7840033            brc     8,125fbc
    <4>           0000000000125f5a: c0e5ffffee23        brasl   %r14,123ba0
    <4>Call Trace:
    <4>([<00000000aac3a838>] 0xaac3a838)
    <4> [<000000000045270c>] schedule+0x178/0x978
    <4> [<000000000012fe86>] migration_thread+0x1aa/0x30c
    <4> [<0000000000152ad8>] kthread+0x68/0xa0
    <4> [<000000000010a0be>] kernel_thread_starter+0x6/0xc
    <4> [<000000000010a0b8>] kernel_thread_starter+0x0/0xc
    <4>Last Breaking-Event-Address:
    <4> [<0000000000125fc6>] pick_next_task_fair+0xb2/0xd4

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [BUG] cpu hotplug vs scheduler
  2008-05-21 12:55             ` Heiko Carstens
@ 2008-05-21 13:03               ` Avi Kivity
  0 siblings, 0 replies; 18+ messages in thread
From: Avi Kivity @ 2008-05-21 13:03 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: Dmitry Adamushko, linux-kernel, Ingo Molnar, Peter Zijlstra,
	Srivatsa Vaddagiri

Heiko Carstens wrote:
> On Wed, May 21, 2008 at 03:42:51PM +0300, Avi Kivity wrote:
>   
>> Heiko Carstens wrote:
>>     
>>> I think you fixed this specific bug. Does cpu hotplug stresstest now work
>>> for you?
>>>       
>> Which bug, the order-2 allocs?  I certainly didn't fix it.
>>
>> Last I tried, cpu hotplug still failed, but it was some time ago as  
>> patches flow.
>>     
>
> Ah no. I thought "x86: fix crash on cpu hotplug on pat-incapable machines"
> would have fixed the cpu hotplug bug for you.
>
>   

No, that only fixes an immediate oops when running in a virtual machine 
(which doesn't have pat).

> I was actually talking of the pick_next_task_fair incarnation of one of
> the cpu hotplug bugs.
>
>   

That hasn't been addressed, AFAIK.

> s390 backtrace on 2.6.25 looks like this.
>
>     <1>Unable to handle kernel pointer dereference at virtual kernel address 00000200000db000
>     <4>Oops: 003b [#1] PREEMPT SMP 
>     <4>Modules linked in: dm_multipath sunrpc qeth_l2 dm_mod cu3088 qeth ccwgroup
>     <4>CPU: 0 Not tainted 2.6.25-27.x.20080422-s390xdefault #1
>     <4>Process migration/0 (pid: 5806, task: 00000000aac3a838, ksp: 000000017a2f3dd0)
>     <4>Krnl PSW : 0400200180000000 0000000000125f48 (pick_next_task_fair+0x34/0xd4)
>     <4>           R:0 T:1 IO:0 EX:0 Key:0 M:0 W:0 P:0 AS:0 CC:2 PM:0 EA:3
>     <4>Krnl GPRS: 0000079e029c6ecb 0000000000000000 000000008548ea60 000000008548d700
>     <4>           00000000004526f8 0000000000000597 0000000000000000 0000000000000000
>     <4>           0000000000672a00 0000000000000000 00000000aac3a838 000000017a2f3d10
>     <4>           00000200000db85a 00000000004672a0 000000017a2f3d50 000000017a2f3d10
>     <4>Krnl Code: 0000000000125f38: e3e0f0980024        stg     %r14,152(%r15)
>     <4>           0000000000125f3e: d507d000c010        clc     0(8,%r13),16(%r12)
>     <4>           0000000000125f44: a784004e            brc     8,125fe0
>     <4>          >0000000000125f48: d507d000c030        clc     0(8,%r13),48(%r12)
>     <4>           0000000000125f4e: b904002c            lgr     %r2,%r12
>     <4>           0000000000125f52: a7990000            lghi    %r9,0
>     <4>           0000000000125f56: a7840033            brc     8,125fbc
>     <4>           0000000000125f5a: c0e5ffffee23        brasl   %r14,123ba0
>     <4>Call Trace:
>     <4>([<00000000aac3a838>] 0xaac3a838)
>     <4> [<000000000045270c>] schedule+0x178/0x978
>     <4> [<000000000012fe86>] migration_thread+0x1aa/0x30c
>     <4> [<0000000000152ad8>] kthread+0x68/0xa0
>     <4> [<000000000010a0be>] kernel_thread_starter+0x6/0xc
>     <4> [<000000000010a0b8>] kernel_thread_starter+0x0/0xc
>     <4>Last Breaking-Event-Address:
>     <4> [<0000000000125fc6>] pick_next_task_fair+0xb2/0xd4
>   

Seems to be the same bug, yes.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [BUG] hotplug cpus on ia64
  2008-05-14  8:13 ` Dmitry Adamushko
  2008-05-14 12:30   ` Avi Kivity
@ 2008-05-21 14:48   ` Cliff Wickman
  1 sibling, 0 replies; 18+ messages in thread
From: Cliff Wickman @ 2008-05-21 14:48 UTC (permalink / raw)
  To: dmitry.adamushko, avi, mingo, heiko.carstens, a.p.zijlstra, vatsa
  Cc: Avi Kivity, linux-kernel, Ingo Molnar, Heiko Carstens,
	PeterZijlst

Gentlemen,

  I built an ia64 kernel from Andrew's tree (2.6.26-rc2-mm1)
and get a very predictable hotplug cpu problem.

billberry1:/tmp/cpw # ./dis
disabled cpu 17
enabled cpu 17
billberry1:/tmp/cpw # ./dis
disabled cpu 17
enabled cpu 17
billberry1:/tmp/cpw # ./dis

The script that disables the cpu always hangs (unkillable)
on the 3rd attempt.
I haven't spent any debugging time on it yet.

Just wondering if you've seen it?   It doesn't seem to happen on x86_64.

-Cliff Wickman
cpw@sgi.com

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [BUG] hotplug cpus on ia64
       [not found] ` <1212154614.12349.244.camel@twins>
@ 2008-06-03 22:17   ` Cliff Wickman
  2008-06-04 13:50     ` Dimitri Sivanich
  2008-06-05 12:49     ` Peter Zijlstra
  0 siblings, 2 replies; 18+ messages in thread
From: Cliff Wickman @ 2008-06-03 22:17 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: sivanich, linux-kernel


On Fri, May 30, 2008 at 03:36:54PM +0200, Peter Zijlstra wrote:
> On Thu, 2008-05-29 at 11:32 -0500, Cliff Wickman wrote:
> > >> I built an ia64 kernel from Andrew's tree (2.6.26-rc2-mm1)
> > >> and get a very predictable hotplug cpu problem.
> > >> billberry1:/tmp/cpw # ./dis
> > >> disabled cpu 17
> > >> enabled cpu 17
> > >> billberry1:/tmp/cpw # ./dis
> > >> disabled cpu 17
> > >> enabled cpu 17
> > >> billberry1:/tmp/cpw # ./dis
> > >> 
> > >> The script that disables the cpu always hangs (unkillable)
> > >> on the 3rd attempt.
> > 
> > > And a bit further:
> > > The kstopmachine thread always sits on the run queue (real time) for about
> > > 30 minutes before running.
> > 
> > And a bit further:
> > 
> > The kstopmachine thread is queued as real-time on the downed cpu:
> > >> rq -f 17
> > CPU#       runq address size Lock     current task         time name
> > ==========================================================================
> >   17 0xe000046003059540    3  U 0xe0000360f06f8000            0 swapper
> >         Total of 3 queued:
> >         3 real time tasks:  px *(rt_rq *)0xe000046003059608
> >         exclusive queue:
> >         slot 0
> >         0xe0000760f4628000            0 migration/17
> >         0xe0000760f4708000            0 kstopmachine
> >         0xe0000760f6678000            0 watchdog/17
> > 
> > I put in counters and see that schedule() is never again entered by cpu 17
> > after it is downed the 3rd time.
> > (it is entered after being up'd the first two times)
> > 
> > The kstopmachine thread is bound to cpu 17 by __stop_machine_run()'s call
> > to kthread_bind().
> > 
> > A cpu does not schedule after being downed, of course.  But it does again
> > after being up'd.  
> > Why would the second up be different?  Following it, if the cpu is
> > downed it never schedules again.
> > 
> > If I always bind kstopmachine to cpu 0 the problem disappears.
> 
> does:
> 
> echo -1 > /proc/sys/kernel/sched_rt_runtime_us
> 
> fix the problem?

Yes! It does. 

Dimitri Sivanich has run into what looks like a similar problem.
Hope the above workaround is a good clue to its solution.
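
One plausible reading of why the workaround helps, sketched as a userspace
model rather than kernel code: writing -1 to sched_rt_runtime_us makes the RT
runtime unlimited, so a run queue can never be throttled regardless of how
much runtime it has been left with. The struct and function names below are
hypothetical; only the comparison mirrors the rt_time/rt_runtime test
discussed in this thread.

```c
#include <stdint.h>

/* Stand-in for the kernel's "no limit" marker; the value is an assumption. */
#define MODEL_RUNTIME_INF UINT64_MAX

/* Hypothetical, stripped-down view of one per-CPU RT run queue. */
struct rt_rq_model {
	uint64_t rt_time;	/* ns of RT execution used in this period */
	uint64_t rt_runtime;	/* ns of RT execution allowed in this period */
};

/* Convert a sched_rt_runtime_us style value to ns; -1 means unlimited. */
uint64_t runtime_from_sysctl_us(long long us)
{
	if (us < 0)
		return MODEL_RUNTIME_INF;
	return (uint64_t)us * 1000ULL;
}

/* Nonzero when the queue would be throttled. */
int rt_runtime_exceeded(const struct rt_rq_model *rq)
{
	if (rq->rt_runtime == MODEL_RUNTIME_INF)
		return 0;
	return rq->rt_time > rq->rt_runtime;
}
```

In this model a queue whose rt_runtime has dropped to 0 throttles as soon as
any RT time is accounted, which would match a real-time kstopmachine thread
sitting on the queue without running. With the unlimited setting the check
always passes.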

-- 
Cliff Wickman
Silicon Graphics, Inc.
cpw@sgi.com
(651) 683-3824

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [BUG] hotplug cpus on ia64
  2008-06-03 22:17   ` Cliff Wickman
@ 2008-06-04 13:50     ` Dimitri Sivanich
  2008-06-05 12:49     ` Peter Zijlstra
  1 sibling, 0 replies; 18+ messages in thread
From: Dimitri Sivanich @ 2008-06-04 13:50 UTC (permalink / raw)
  To: Cliff Wickman; +Cc: Peter Zijlstra, linux-kernel

On Tue, Jun 03, 2008 at 05:17:59PM -0500, Cliff Wickman wrote:
> 
> On Fri, May 30, 2008 at 03:36:54PM +0200, Peter Zijlstra wrote:
> > On Thu, 2008-05-29 at 11:32 -0500, Cliff Wickman wrote:
> > > I put in counters and see that schedule() is never again entered by cpu 17
> > > after it is downed the 3rd time.
> > > (it is entered after being up'd the first two times)
> > > 
> > > The kstopmachine thread is bound to cpu 17 by __stop_machine_run()'s call
> > > to kthread_bind().
> > > 
> > > A cpu does not schedule after being downed, of course.  But it does again
> > > after being up'd.  
> > > Why would the second up be different?  Following it, if the cpu is
> > > downed it never schedules again.
> > > 
> > > If I always bind kstopmachine to cpu 0 the problem disappears.
> > 
> > does:
> > 
> > echo -1 > /proc/sys/kernel/sched_rt_runtime_us
> > 
> > fix the problem?
> 
> Yes! It does. 
> 
> Dimitri Sivanich has run into what looks like a similar problem.
> Hope the above workaround is a good clue to its solution.

This fixes the problem I was seeing as well.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [BUG] hotplug cpus on ia64
  2008-06-03 22:17   ` Cliff Wickman
  2008-06-04 13:50     ` Dimitri Sivanich
@ 2008-06-05 12:49     ` Peter Zijlstra
  2008-06-05 13:51       ` Dimitri Sivanich
  2008-06-10 10:19       ` Ingo Molnar
  1 sibling, 2 replies; 18+ messages in thread
From: Peter Zijlstra @ 2008-06-05 12:49 UTC (permalink / raw)
  To: Cliff Wickman; +Cc: sivanich, linux-kernel

On Tue, 2008-06-03 at 17:17 -0500, Cliff Wickman wrote:
> On Fri, May 30, 2008 at 03:36:54PM +0200, Peter Zijlstra wrote:
> > On Thu, 2008-05-29 at 11:32 -0500, Cliff Wickman wrote:
> > > >> I built an ia64 kernel from Andrew's tree (2.6.26-rc2-mm1)
> > > >> and get a very predictable hotplug cpu problem.
> > > >> billberry1:/tmp/cpw # ./dis
> > > >> disabled cpu 17
> > > >> enabled cpu 17
> > > >> billberry1:/tmp/cpw # ./dis
> > > >> disabled cpu 17
> > > >> enabled cpu 17
> > > >> billberry1:/tmp/cpw # ./dis
> > > >> 
> > > >> The script that disables the cpu always hangs (unkillable)
> > > >> on the 3rd attempt.
> > > 
> > > > And a bit further:
> > > > The kstopmachine thread always sits on the run queue (real time) for about
> > > > 30 minutes before running.
> > > 
> > > And a bit further:
> > > 
> > > The kstopmachine thread is queued as real-time on the downed cpu:
> > > >> rq -f 17
> > > CPU#       runq address size Lock     current task         time name
> > > ==========================================================================
> > >   17 0xe000046003059540    3  U 0xe0000360f06f8000            0 swapper
> > >         Total of 3 queued:
> > >         3 real time tasks:  px *(rt_rq *)0xe000046003059608
> > >         exclusive queue:
> > >         slot 0
> > >         0xe0000760f4628000            0 migration/17
> > >         0xe0000760f4708000            0 kstopmachine
> > >         0xe0000760f6678000            0 watchdog/17
> > > 
> > > I put in counters and see that schedule() is never again entered by cpu 17
> > > after it is downed the 3rd time.
> > > (it is entered after being up'd the first two times)
> > > 
> > > The kstopmachine thread is bound to cpu 17 by __stop_machine_run()'s call
> > > to kthread_bind().
> > > 
> > > A cpu does not schedule after being downed, of course.  But it does again
> > > after being up'd.  
> > > Why would the second up be different?  Following it, if the cpu is
> > > downed it never schedules again.
> > > 
> > > If I always bind kstopmachine to cpu 0 the problem disappears.
> > 
> > does:
> > 
> > echo -1 > /proc/sys/kernel/sched_rt_runtime_us
> > 
> > fix the problem?
> 
> Yes! It does. 
> 
> Dimitri Sivanich has run into what looks like a similar problem.
> Hope the above workaround is a good clue to its solution.


Does the below fix it?


Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
 kernel/sched.c    |   15 +++++--
 kernel/sched_rt.c |  109 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 115 insertions(+), 9 deletions(-)

Index: linux-2.6/kernel/sched_rt.c
===================================================================
--- linux-2.6.orig/kernel/sched_rt.c
+++ linux-2.6/kernel/sched_rt.c
@@ -280,6 +280,9 @@ static int balance_runtime(struct rt_rq 
 			continue;
 
 		spin_lock(&iter->rt_runtime_lock);
+		if (iter->rt_runtime == RUNTIME_INF)
+			goto next;
+
 		diff = iter->rt_runtime - iter->rt_time;
 		if (diff > 0) {
 			do_div(diff, weight);
@@ -293,12 +296,105 @@ static int balance_runtime(struct rt_rq 
 				break;
 			}
 		}
+next:
 		spin_unlock(&iter->rt_runtime_lock);
 	}
 	spin_unlock(&rt_b->rt_runtime_lock);
 
 	return more;
 }
+
+static void __disable_runtime(struct rq *rq)
+{
+	struct root_domain *rd = rq->rd;
+	struct rt_rq *rt_rq;
+
+	if (unlikely(!scheduler_running))
+		return;
+
+	for_each_leaf_rt_rq(rt_rq, rq) {
+		struct rt_bandwidth *rt_b = sched_rt_bandwidth(rt_rq);
+		s64 want;
+		int i;
+
+		spin_lock(&rt_b->rt_runtime_lock);
+		spin_lock(&rt_rq->rt_runtime_lock);
+		if (rt_rq->rt_runtime == RUNTIME_INF ||
+				rt_rq->rt_runtime == rt_b->rt_runtime)
+			goto balanced;
+		spin_unlock(&rt_rq->rt_runtime_lock);
+
+		want = rt_b->rt_runtime - rt_rq->rt_runtime;
+
+		for_each_cpu_mask(i, rd->span) {
+			struct rt_rq *iter = sched_rt_period_rt_rq(rt_b, i);
+			s64 diff;
+
+			if (iter == rt_rq)
+				continue;
+
+			spin_lock(&iter->rt_runtime_lock);
+			if (want > 0) {
+				diff = min_t(s64, iter->rt_runtime, want);
+				iter->rt_runtime -= diff;
+				want -= diff;
+			} else {
+				iter->rt_runtime -= want;
+				want -= want;
+			}
+			spin_unlock(&iter->rt_runtime_lock);
+
+			if (!want)
+				break;
+		}
+
+		spin_lock(&rt_rq->rt_runtime_lock);
+		BUG_ON(want);
+balanced:
+		rt_rq->rt_runtime = RUNTIME_INF;
+		spin_unlock(&rt_rq->rt_runtime_lock);
+		spin_unlock(&rt_b->rt_runtime_lock);
+	}
+}
+
+static void disable_runtime(struct rq *rq)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&rq->lock, flags);
+	__disable_runtime(rq);
+	spin_unlock_irqrestore(&rq->lock, flags);
+}
+
+static void __enable_runtime(struct rq *rq)
+{
+	struct root_domain *rd = rq->rd;
+	struct rt_rq *rt_rq;
+
+	if (unlikely(!scheduler_running))
+		return;
+
+	for_each_leaf_rt_rq(rt_rq, rq) {
+		struct rt_bandwidth *rt_b = sched_rt_bandwidth(rt_rq);
+
+		spin_lock(&rt_b->rt_runtime_lock);
+		spin_lock(&rt_rq->rt_runtime_lock);
+		rt_rq->rt_runtime = rt_b->rt_runtime;
+		rt_rq->rt_time = 0;
+		spin_unlock(&rt_rq->rt_runtime_lock);
+		spin_unlock(&rt_b->rt_runtime_lock);
+	}
+}
+
+static void enable_runtime(struct rq *rq)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&rq->lock, flags);
+	__enable_runtime(rq);
+	spin_unlock_irqrestore(&rq->lock, flags);
+}
+
 #endif
 
 static inline int rt_se_prio(struct sched_rt_entity *rt_se)
@@ -328,14 +424,13 @@ static int sched_rt_runtime_exceeded(str
 
 #ifdef CONFIG_SMP
 	if (rt_rq->rt_time > runtime) {
-		int more;
-
 		spin_unlock(&rt_rq->rt_runtime_lock);
-		more = balance_runtime(rt_rq);
+		balance_runtime(rt_rq);
 		spin_lock(&rt_rq->rt_runtime_lock);
 
-		if (more)
-			runtime = sched_rt_runtime(rt_rq);
+		runtime = sched_rt_runtime(rt_rq);
+		if (runtime == RUNTIME_INF)
+			return 0;
 	}
 #endif
 
@@ -1157,6 +1252,8 @@ static void join_domain_rt(struct rq *rq
 {
 	if (rq->rt.overloaded)
 		rt_set_overload(rq);
+
+	__enable_runtime(rq);
 }
 
 /* Assumes rq->lock is held */
@@ -1164,6 +1261,8 @@ static void leave_domain_rt(struct rq *r
 {
 	if (rq->rt.overloaded)
 		rt_clear_overload(rq);
+
+	__disable_runtime(rq);
 }
 
 /*
Index: linux-2.6/kernel/sched.c
===================================================================
--- linux-2.6.orig/kernel/sched.c
+++ linux-2.6/kernel/sched.c
@@ -7455,20 +7455,27 @@ int sched_create_sysfs_power_savings_ent
 static int update_sched_domains(struct notifier_block *nfb,
 				unsigned long action, void *hcpu)
 {
+	int cpu = (int)(long)hcpu;
+
 	switch (action) {
-	case CPU_UP_PREPARE:
-	case CPU_UP_PREPARE_FROZEN:
 	case CPU_DOWN_PREPARE:
 	case CPU_DOWN_PREPARE_FROZEN:
+		disable_runtime(cpu_rq(cpu));
+		/* fall-through */
+	case CPU_UP_PREPARE:
+	case CPU_UP_PREPARE_FROZEN:
 		detach_destroy_domains(&cpu_online_map);
 		return NOTIFY_OK;
 
-	case CPU_UP_CANCELED:
-	case CPU_UP_CANCELED_FROZEN:
+
 	case CPU_DOWN_FAILED:
 	case CPU_DOWN_FAILED_FROZEN:
 	case CPU_ONLINE:
 	case CPU_ONLINE_FROZEN:
+		enable_runtime(cpu_rq(cpu));
+		/* fall-through */
+	case CPU_UP_CANCELED:
+	case CPU_UP_CANCELED_FROZEN:
 	case CPU_DEAD:
 	case CPU_DEAD_FROZEN:
 		/*
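
The rebalancing loop in __disable_runtime() above can be modelled in userspace
to see what it does to the per-CPU budgets. This is only a sketch under
simplifying assumptions: no locking, a plain array instead of rt_rq
structures, and hypothetical names (reclaim_runtime, MODEL_RUNTIME_INF).

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the "unlimited" marker; the value is an assumption. */
#define MODEL_RUNTIME_INF INT64_MAX

/*
 * Model of the loop: the departing CPU `self` settles its balance against
 * the other CPUs so that the sum of their budgets is unchanged once `self`
 * is taken out, then marks itself unlimited.
 *
 * Returns the amount that could not be settled; the patch treats any
 * nonzero leftover as a bug (BUG_ON(want)).
 */
int64_t reclaim_runtime(int64_t *runtime, size_t ncpus, size_t self,
			int64_t global)
{
	int64_t want = global - runtime[self];
	size_t i;

	for (i = 0; i < ncpus && want != 0; i++) {
		if (i == self)
			continue;

		if (want > 0) {
			/* We lent runtime out: take back what this CPU has. */
			int64_t diff = runtime[i] < want ? runtime[i] : want;

			runtime[i] -= diff;
			want -= diff;
		} else {
			/* We borrowed: hand the surplus to this CPU. */
			runtime[i] -= want;
			want = 0;
		}
	}

	runtime[self] = MODEL_RUNTIME_INF;
	return want;
}
```

For example, with a global budget of 950 and per-CPU values {1200, 950, 700,
950}, taking CPU 2 down pulls 250 back from CPU 0, leaving {950, 950, INF,
950} and a return value of 0.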



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [BUG] hotplug cpus on ia64
  2008-06-05 12:49     ` Peter Zijlstra
@ 2008-06-05 13:51       ` Dimitri Sivanich
  2008-06-05 14:18         ` Peter Zijlstra
  2008-06-10 10:19       ` Ingo Molnar
  1 sibling, 1 reply; 18+ messages in thread
From: Dimitri Sivanich @ 2008-06-05 13:51 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Cliff Wickman, linux-kernel

On Thu, Jun 05, 2008 at 02:49:58PM +0200, Peter Zijlstra wrote:
> 
> Does the below fix it?
> 

I don't believe so:


:~ # taskset -p $$
pid 4502's current affinity mask: 8
:~ # cd /sys/devices/system
:/sys/devices/system # cd cpu
:/sys/devices/system/cpu # cd cpu2
:/sys/devices/system/cpu/cpu2 # cat online
1
:/sys/devices/system/cpu/cpu2 # echo 0 >online
:/sys/devices/system/cpu/cpu2 # taskset -p $$
pid 4502's current affinity mask: 8
:/sys/devices/system/cpu/cpu2 # taskset -cp 0-3 $$
pid 4502's current affinity list: 3
pid 4502's new affinity list: 0,1,3
:/sys/devices/system/cpu/cpu2 # taskset -p $$
pid 4502's current affinity mask: b
:/sys/devices/system/cpu/cpu2 # echo 1 >online
:/sys/devices/system/cpu/cpu2 # taskset -p $$
pid 4502's current affinity mask: b
:/sys/devices/system/cpu/cpu2 # taskset -cp 0-3 $$
pid 4502's current affinity list: 0,1,3
pid 4502's new affinity list: 0-3
:/sys/devices/system/cpu/cpu2 # taskset -p $$
pid 4502's current affinity mask: f
:/sys/devices/system/cpu/cpu2 # echo 0 >online
:/sys/devices/system/cpu/cpu2 # taskset -p $$
pid 4502's current affinity mask: b
:/sys/devices/system/cpu/cpu2 # echo 1 >online
:/sys/devices/system/cpu/cpu2 # taskset -p $$
(above command now hangs)
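
For reference, the hex affinity masks in the transcript above map to CPU lists
as follows: 8 is CPU 3, b is CPUs 0,1,3 and f is CPUs 0-3. A small sketch of
that conversion (mask_to_cpulist is a hypothetical helper, written to mirror
the list style seen in the transcript, where a run of two CPUs is printed
as "0,1" and longer runs as a range):

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the set bits of `mask` as a CPU list in
 * `buf`. Returns the number of characters written, not counting the
 * terminating NUL; output stops early if `buf` is too small.
 */
size_t mask_to_cpulist(unsigned long mask, char *buf, size_t len)
{
	const unsigned int nbits = (unsigned int)(sizeof(mask) * CHAR_BIT);
	unsigned int cpu = 0;
	size_t off = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	while (cpu < nbits) {
		const char *sep = off ? "," : "";
		unsigned int start, end;
		int n;

		if (!((mask >> cpu) & 1UL)) {
			cpu++;
			continue;
		}

		/* Extend the run while the next bit is also set. */
		start = end = cpu;
		while (end + 1 < nbits && ((mask >> (end + 1)) & 1UL))
			end++;

		if (end == start)
			n = snprintf(buf + off, len - off, "%s%u", sep, start);
		else if (end == start + 1)
			n = snprintf(buf + off, len - off, "%s%u,%u", sep, start, end);
		else
			n = snprintf(buf + off, len - off, "%s%u-%u", sep, start, end);

		if (n < 0 || (size_t)n >= len - off)
			break;	/* out of space */
		off += (size_t)n;
		cpu = end + 1;
	}
	return off;
}
```

Read this way, the mask going from f to b after the second offline is just
CPU 2 dropping out of the allowed set, as expected.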


(ps output)
0xe0000060b5650000     4502     4349  0    2   S  0xe0000060b5650390  bash
0xe0000060b8da0000     4843     4502  0    2   D  0xe0000060b8da0390  bash


Stack traceback for pid 4843
0xe0000060b8da0000     4843     4502  0    2   D  0xe0000060b8da0390  bash
0xa0000001007d44b0 schedule+0x1210
        args (0xe0000060ba470ce4, 0xa000000100dae190, 0xe000006003129200, 0xa000000100084b70, 0x48c, 0xe0000060b8dafda8, 0xe000006003129200, 0x200, 0xe0000060f780fe80)
0xa0000001007d4ac0 schedule_timeout+0x40
        args (0x7fffffffffffffff, 0x0, 0x0, 0xa0000001007d2f00, 0x309, 0xe000006003129200)
0xa0000001007d2f00 wait_for_common+0x240
        args (0xe0000060b8dafe08, 0x7fffffffffffffff, 0x2, 0xa0000001007d3280, 0x207, 0xe0000060ba470070)
0xa0000001007d3280 wait_for_completion+0x40
        args (0xe0000060b8dafe08, 0xa00000010008d990, 0x38a, 0xffffffffffff9200)
0xa00000010008d990 sched_exec+0x1b0
        args (0x2, 0xe0000060ba470000, 0xe0000060ba470010, 0xe000006003129200, 0xa00000010017e980, 0x58e, 0xa00000010017dce0)
0xa00000010017e980 do_execve+0xa0
        args (0xe0000060f39e5000, 0x60000000000394b0, 0x6000000000056150, 0xe0000060b8dafe40, 0xe0000060f799f100, 0xe0000060f799bb00, 0xe0000060f799bbd8, 0x60000000000620b1, 0xa000000100013940)
0xa000000100013940 sys_execve+0x60
        args (0xe0000060f39e5000, 0xe0000060f39e5000, 0x6000000000056150, 0xe0000060b8dafe40, 0xa00000010000a270, 0x50e, 0x2000000000028490)
0xa00000010000a270 ia64_execve+0x30
        args (0x60000000000620a0, 0x60000000000394b0, 0x6000000000056150, 0x0, 0xc00000000000058e, 0x400000000003d020, 0x60000000000394b0, 0x0, 0xa00000010000aba0)
0xa00000010000aba0 ia64_ret_from_syscall
        args (0x60000000000620a0, 0x60000000000394b0, 0x6000000000056150, 0x0, 0xc00000000000058e, 0x400000000003d020, 0x60000000000394b0, 0x0)
0xa000000000010720 __kernel_syscall_via_break
        args (0x60000000000620a0, 0x60000000000394b0, 0x6000000000056150, 0x0, 0xc00000000000058e, 0x400000000003d020, 0x60000000000394b0, 0x0)



* Re: [BUG] hotplug cpus on ia64
  2008-06-05 13:51       ` Dimitri Sivanich
@ 2008-06-05 14:18         ` Peter Zijlstra
  0 siblings, 0 replies; 18+ messages in thread
From: Peter Zijlstra @ 2008-06-05 14:18 UTC (permalink / raw)
  To: Dimitri Sivanich; +Cc: Cliff Wickman, linux-kernel

On Thu, 2008-06-05 at 08:51 -0500, Dimitri Sivanich wrote:
> On Thu, Jun 05, 2008 at 02:49:58PM +0200, Peter Zijlstra wrote:
> > 
> > Does the below fix it?
> > 
> 
> I don't believe so:

Humpfh :-( I'll continue looking then...

Thanks for testing.



* Re: [BUG] hotplug cpus on ia64
  2008-06-05 12:49     ` Peter Zijlstra
  2008-06-05 13:51       ` Dimitri Sivanich
@ 2008-06-10 10:19       ` Ingo Molnar
  1 sibling, 0 replies; 18+ messages in thread
From: Ingo Molnar @ 2008-06-10 10:19 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Cliff Wickman, sivanich, linux-kernel


* Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:

> > > does:
> > > 
> > > echo -1 > /proc/sys/kernel/sched_rt_runtime_us
> > > 
> > > fix the problem?
> > 
> > Yes! It does. 
> > 
> > Dimitri Sivanich has run into what looks like a similar problem.
> > Hope the above workaround is a good clue to its solution.
> 
> Does the below fix it?

while it's not the full fix i've applied it to tip/sched-devel for more 
testing. Thanks,

	Ingo
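
The workaround quoted above writes -1 to /proc/sys/kernel/sched_rt_runtime_us, which turns off the RT runtime limit. A rough sketch of checking and applying that setting from a script is below. The function names are hypothetical, the path parameter exists only so the helpers can be tried against an ordinary file, and writing the real sysctl needs root.

```python
from pathlib import Path

# sysctl file named in the thread
RT_RUNTIME = Path("/proc/sys/kernel/sched_rt_runtime_us")


def rt_throttling_disabled(path=RT_RUNTIME):
    """True if the RT runtime limit is switched off (value is -1)."""
    return int(Path(path).read_text().strip()) == -1


def disable_rt_throttling(path=RT_RUNTIME):
    """Same effect as: echo -1 > /proc/sys/kernel/sched_rt_runtime_us"""
    Path(path).write_text("-1\n")
```

Calling `rt_throttling_disabled()` before a hotplug test shows whether the workaround is in effect; it is a workaround only, not the fix discussed in the thread.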



Thread overview: 18+ messages
2008-05-13 14:33 [BUG] cpu hotplug vs scheduler Avi Kivity
2008-05-13 15:33 ` Avi Kivity
2008-05-13 19:00   ` Heiko Carstens
2008-05-14  8:13 ` Dmitry Adamushko
2008-05-14 12:30   ` Avi Kivity
2008-05-14 13:05     ` Dmitry Adamushko
2008-05-15 10:19       ` Avi Kivity
2008-05-21 12:31         ` Heiko Carstens
2008-05-21 12:42           ` Avi Kivity
2008-05-21 12:55             ` Heiko Carstens
2008-05-21 13:03               ` Avi Kivity
2008-05-21 14:48   ` [BUG] hotplug cpus on ia64 Cliff Wickman
     [not found] <E1K1l2y-0007bu-44@eag09.americas.sgi.com>
     [not found] ` <1212154614.12349.244.camel@twins>
2008-06-03 22:17   ` Cliff Wickman
2008-06-04 13:50     ` Dimitri Sivanich
2008-06-05 12:49     ` Peter Zijlstra
2008-06-05 13:51       ` Dimitri Sivanich
2008-06-05 14:18         ` Peter Zijlstra
2008-06-10 10:19       ` Ingo Molnar
