public inbox for linux-kernel@vger.kernel.org
* cpu_exclusive feature of cpuset broken?
@ 2006-03-16 15:28 Srivatsa Vaddagiri
  2006-03-16 16:38 ` Paul Jackson
From: Srivatsa Vaddagiri @ 2006-03-16 15:28 UTC
  To: linux-kernel; +Cc: pj

Hello,
	I was testing cpuset in 2.6.16-rc6 and found that I can't create
exclusive cpusets. As soon as I make a cpuset exclusive, the system locks up.
Here's what I did to trigger the lockup:

	# mkdir /dev/cpuset
	# mount -t cpuset none /dev/cpuset
	# cd /dev/cpuset
	# mkdir a
	# /bin/echo 7 > cpus
	# /bin/echo 1 > cpu_exclusive
		<System locks up here>

I saw this problem on two machines (a 4-way x86-64 box and an 8-way x86 box)
but didn't see the lockup on another 4-way x86 box. I am puzzled. Before
I start digging further, I wanted to check if anyone else has seen the
problem.

When the lockup happened on the x86-64 box, the NMI watchdog caught it and
printed these messages:



llm17:~ # NMI Watchdog detected LOCKUP on CPU 2
CPU 2 
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.16-rc6 #5
RIP: 0010:[<ffffffff80128fcb>] <ffffffff80128fcb>{find_busiest_group+316}
RSP: 0018:ffff810237ce3e80  EFLAGS: 00000046
RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff810008c8e400 RDI: ffff810008c876e0
RBP: ffff810237ce3ef8 R08: 0000000000000000 R09: ffff810008c876e0
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff80588338
R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff810237c772c0(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002ad5fdb7b190 CR3: 0000000000101000 CR4: 00000000000006e0
Process swapper (pid: 0, threadinfo ffff810237cdc000, task ffff810237c99180)
Stack: 0000000000000000 0000000000000000 ffff810008c87660 0000000000000000 
       ffff810237ce3f38 ffff810237ce3f30 0000000230e97d08 ffffffff80588320 
       0000000129da210c 0000000000000000 
Call Trace: <IRQ> <ffffffff8012c2af>{rebalance_tick+340}
       <ffffffff80139bec>{update_process_times+92} <ffffffff80118567>{smp_local_timer_interrupt+35}
       <ffffffff80118bfd>{smp_apic_timer_interrupt+65} <ffffffff801098b6>{mwait_idle+0}
       <ffffffff8010b382>{apic_timer_interrupt+98} <EOI> <ffffffff801098ec>{mwait_idle+54}
       <ffffffff80109893>{cpu_idle+151} <ffffffff80118383>{start_secondary+1120}

Code: 48 0f 46 d1 49 01 d5 49 8b 54 24 08 8d 4b 01 48 d3 ea b9 20 
console shuts up ...
 NMI Watchdog detected LOCKUP on CPU 0
CPU 0 
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.16-rc6 #5
RIP: 0010:[<ffffffff80128fee>] <ffffffff80128fee>{find_busiest_group+351}
RSP: 0018:ffffffff80572e80  EFLAGS: 00000046
RAX: 000000000000001c RBX: 0000000000000003 RCX: 0000000000000020
RDX: 0000000000000000 RSI: ffff810008c8e400 RDI: ffff810008c776e0
RBP: ffffffff80572ef8 R08: 0000000000000000 R09: ffff810008c776e0
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff80588338
R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffffffff805fc000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002b8140ddf000 CR3: 000000022ea52000 CR4: 00000000000006e0
Process swapper (pid: 0, threadinfo ffffffff8060e000, task ffffffff8048a340)
Stack: 0000000000000000 0000000000000000 ffff810008c77660 0000000000000000 
       ffffffff80572f38 ffffffff80572f30 0000000080139648 ffffffff80588320 
       0000000124ae798c 0000000000000000 
Call Trace: <IRQ> <ffffffff8012c2af>{rebalance_tick+340}
       <ffffffff80139bec>{update_process_times+92} <ffffffff80118567>{smp_local_timer_interrupt+35}
       <ffffffff80118bfd>{smp_apic_timer_interrupt+65} <ffffffff801098b6>{mwait_idle+0}
       <ffffffff8010b382>{apic_timer_interrupt+98} <EOI> <ffffffff801098ec>{mwait_idle+54}
       <ffffffff80109893>{cpu_idle+151} <ffffffff8061076c>{start_kernel+465}
       <ffffffff80610296>{_sinittext+662}

Code: 48 0f 44 f0 8d 5c 33 01 83 fb 20 0f 4f d9 83 fb 1f 0f 8e 18 
console shuts up ...
 <0>Kernel panic - not syncing: Aiee, killing interrupt handler!


-- 
Regards,
vatsa


* Re: cpu_exclusive feature of cpuset broken?
  2006-03-16 15:28 cpu_exclusive feature of cpuset broken? Srivatsa Vaddagiri
@ 2006-03-16 16:38 ` Paul Jackson
  2006-03-16 17:01   ` Srivatsa Vaddagiri
From: Paul Jackson @ 2006-03-16 16:38 UTC
  To: vatsa; +Cc: linux-kernel

Srivatsa wrote:
	# cd /dev/cpuset
	# mkdir a
	# /bin/echo 7 > cpus
	# /bin/echo 1 > cpu_exclusive

I have not seen anything resembling such a lockup.

However, you are doing something odd here.

While you created a subcpuset 'a', you changed the
cpus and cpu_exclusive in the root cpuset.  This
changed -all- tasks to only be allowed to run on
cpu 7.
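
For reference, the 'cpus' file takes a list of CPU numbers and
ranges, not a bitmask, so writing 7 selects CPU 7 alone rather than
CPUs 0-2 as a 0x7 mask would. A minimal sketch of how such a list
expands; expand_cpu_list is a hypothetical helper written for this
illustration, not part of any cpuset tooling:

```shell
# expand_cpu_list: hypothetical helper, for illustration only.
# Expands a cpuset-style list such as "0-2,7" into one CPU number per line.
# The body runs in a subshell so the IFS change does not leak to the caller.
expand_cpu_list() (
    IFS=,
    for part in $1; do
        case $part in
            *-*)
                # A range "lo-hi": print every CPU from lo to hi inclusive.
                cpu=${part%-*}
                last=${part#*-}
                while [ "$cpu" -le "$last" ]; do
                    echo "$cpu"
                    cpu=$((cpu + 1))
                done
                ;;
            *)
                # A single CPU number.
                echo "$part"
                ;;
        esac
    done
)
```

With this sketch, 'expand_cpu_list 7' prints only 7, while
'expand_cpu_list 0-2' prints 0, 1 and 2 on separate lines.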

I'd guess you have some kernel thread or such that
really, really wants to run on some other cpu.

When I read your transcript, I expected it to say:

	# mkdir /dev/cpuset
	# mount -t cpuset cpuset /dev/cpuset	# s/none/cpuset/ - clearer
	# cd /dev/cpuset
	# mkdir a
	# cd a					# the missing step
	# /bin/echo 7 > cpus
	# /bin/echo 1 > cpu_exclusive

The s/none/cpuset/ in the mount command is just a nit.
That field shows up in various mount command error
messages, and 'cpuset' is a lot clearer than 'none' in
such messages.

When I do your commands (without the 'cd a'), I don't
see any problem or hang on my Altix test box.  But that
probably just means I am not critically depending at that
moment on some kernel thread running on any particular cpu.
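
One way to avoid the missing 'cd a' slip is to write through full
paths instead of relying on the current directory. The following is
only a sketch: make_exclusive_cpuset is a hypothetical wrapper, and
it assumes the cpuset filesystem is already mounted at the root
directory passed as the first argument.

```shell
# make_exclusive_cpuset: hypothetical wrapper, for illustration only.
#   $1: directory where the cpuset filesystem is mounted (e.g. /dev/cpuset)
#   $2: name of the child cpuset to create
#   $3: CPU list to assign to the child (e.g. 7)
make_exclusive_cpuset() {
    root=$1
    name=$2
    cpus=$3
    mkdir -p "$root/$name" || return 1
    # Full paths ensure the writes land in the child, never in the root cpuset.
    /bin/echo "$cpus" > "$root/$name/cpus" || return 1
    /bin/echo 1 > "$root/$name/cpu_exclusive" || return 1
}
```

On a system with cpusets mounted, the intended call would be
'make_exclusive_cpuset /dev/cpuset a 7'. Pointing it at a scratch
directory instead just creates ordinary files, which is a harmless
way to check where the writes go.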

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401


* Re: cpu_exclusive feature of cpuset broken?
  2006-03-16 16:38 ` Paul Jackson
@ 2006-03-16 17:01   ` Srivatsa Vaddagiri
  2006-03-16 17:31     ` Paul Jackson
From: Srivatsa Vaddagiri @ 2006-03-16 17:01 UTC
  To: Paul Jackson; +Cc: linux-kernel

On Thu, Mar 16, 2006 at 08:38:36AM -0800, Paul Jackson wrote:
> While you created a subcpuset 'a', you changed the
> cpus and cpu_exclusive in the root cpuset.  This
> changed -all- tasks to only be allowed to run on
> cpu 7.

Oops, sorry. I did mean to say that I was trying to change the
cpu_exclusive flag of 'a' (and not of the root cpuset).

> 
> I'd guess you have some kernel thread or such that
> really, really wants to run on some other cpu.
> 
> When I read your transcript, I expected it to say:
> 
> 	# mkdir /dev/cpuset
> 	# mount -t cpuset cpuset /dev/cpuset	# s/none/cpuset/ - clearer
> 	# cd /dev/cpuset
> 	# mkdir a
> 	# cd a					# the missing step

Yes, you are right. I did run 'cd a' and then ran the commands below.

> 	# /bin/echo 7 > cpus
> 	# /bin/echo 1 > cpu_exclusive

> When I do your commands (without the 'cd a'), I don't
> see any problem or hang on my Altix test box.  But that
> probably just means I am not critically depending at that
> moment on some kernel thread running on any particular cpu.

Did you try with the 'cd a' (in other words, turning on the exclusive
property of cpuset 'a')?

-- 
Regards,
vatsa


* Re: cpu_exclusive feature of cpuset broken?
  2006-03-16 17:01   ` Srivatsa Vaddagiri
@ 2006-03-16 17:31     ` Paul Jackson
From: Paul Jackson @ 2006-03-16 17:31 UTC
  To: vatsa; +Cc: linux-kernel

> Did you try with the 'cd a' (in other words turn on exclusive 
> property of cpuset 'a')?

Yes - that works fine.  I've never seen any problem like this.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@sgi.com> 1.925.600.0401

