* 2.6.12-rc6-mm1
From: Martin J. Bligh @ 2005-06-07 23:50 UTC
To: Andrew Morton; +Cc: linux-kernel
Wheeee! it actually compiles and boots for me on x86 ;-)
http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
Seems to show that perf is rather sucky on kernbench though.
baseline (-rc6) data is here:
http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/4760/kernbench.test/
-mm1 is here:
http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/4876/kernbench.test/
Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).
If I factor it by 4x, I get:
47796 10.9% total
16644 30.5% buffered_rmqueue
15574 7.7% default_idle
2229 239.4% kmem_cache_free
1782 11.1% zap_pte_range
1752 0.0% inotify_inode_queue_event
1467 36.3% release_pages
1281 73.3% set_page_dirty
1155 12.8% do_wp_page
924 8.3% _spin_lock
896 0.0% find_idlest_group
828 21.7% free_hot_cold_page
780 0.0% drain_remote_pages
772 0.0% dput_recursive
464 0.0% inotify_dentry_parent_queue_event
...
-412 -8.1% __d_lookup
-508 -98.4% find_idlest_cpu
-542 -24.5% do_anonymous_page
-549 -47.5% current_fs_time
-580 -100.0% del_timer_sync
-594 -86.6% dput
-695 -31.4% __copy_user_intel
-1461 -13.9% strnlen_user
Buggered if I know what that is from. I'm guessing scheduler, or the
HZ change. I guess I can rerun with the HZ set to 1000 ... you got any
experimental scheduler stuff in your tree?
Else I guess it's some memory allocator stuff maybe?
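
A minimal sketch of the 4x factoring mentioned above, assuming the profiler takes one sample per timer tick, so that a run at HZ=250 collects a quarter of the ticks that the same run would collect at HZ=1000. The helper names (scale_ticks, tick_diff) and any numbers passed to them are hypothetical and are not part of diffprofile.

```c
#include <assert.h>

/*
 * Hypothetical helper: rescale a tick count sampled at sample_hz so it
 * can be compared with a profile sampled at baseline_hz.
 * Assumes one profile sample per timer tick.
 */
long scale_ticks(long ticks, long sample_hz, long baseline_hz)
{
	assert(sample_hz > 0 && baseline_hz > 0);
	return ticks * baseline_hz / sample_hz;
}

/* Difference between the rescaled count and the baseline count. */
long tick_diff(long baseline_ticks, long ticks, long sample_hz,
	       long baseline_hz)
{
	return scale_ticks(ticks, sample_hz, baseline_hz) - baseline_ticks;
}
```

With these helpers, tick_diff(350, 100, 250, 1000) rescales 100 ticks to 400 and reports a difference of 50 against a baseline of 350.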

* Re: 2.6.12-rc6-mm1
From: Andrew Morton @ 2005-06-07 23:56 UTC
To: Martin J. Bligh; +Cc: linux-kernel, Christoph Lameter

"Martin J. Bligh" <mbligh@mbligh.org> wrote:
>
> Wheeee! it actually compiles and boots for me on x86 ;-)

We aim to please.

> http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
>
> Seems to show that perf is rather sucky on kernbench though.

CPU scheduler.

> baseline (-rc6) data is here:
>
> http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/4760/kernbench.test/
>
> -mm1 is here:
>
> http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/4876/kernbench.test/
>
> Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).

Oh crap, so it does. That's wrong.

> If I factor it by 4x, I get:

Would it be possible to set it back to 100Hz, retest?

* Re: 2.6.12-rc6-mm1
From: Christoph Lameter @ 2005-06-08 0:02 UTC
To: Andrew Morton; +Cc: Martin J. Bligh, linux-kernel

On Tue, 7 Jun 2005, Andrew Morton wrote:

> > Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).
>
> Oh crap, so it does. That's wrong.

Email by you and Linus indicated that 250 should be the default.

* Re: 2.6.12-rc6-mm1
From: Andrew Morton @ 2005-06-08 0:08 UTC
To: Christoph Lameter; +Cc: mbligh, linux-kernel

Christoph Lameter <clameter@engr.sgi.com> wrote:
>
> On Tue, 7 Jun 2005, Andrew Morton wrote:
>
> > > Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).
> >
> > Oh crap, so it does. That's wrong.
>
> Email by you and Linus indicated that 250 should be the default.

Oh, OK. hrm.

Martin, it would be useful if you could determine whether the kernbench
slowdown was due to the 1000Hz->250Hz change, thanks.

I'm assuming it was the CPU scheduler patches. There are 36 of them ;)

* Re: 2.6.12-rc6-mm1
From: Nick Piggin @ 2005-06-08 3:17 UTC
To: Andrew Morton; +Cc: Christoph Lameter, mbligh, lkml, Con Kolivas

On Tue, 2005-06-07 at 17:08 -0700, Andrew Morton wrote:
> Christoph Lameter <clameter@engr.sgi.com> wrote:
> >
> > On Tue, 7 Jun 2005, Andrew Morton wrote:
> >
> > > > Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).
> > >
> > > Oh crap, so it does. That's wrong.
> >
> > Email by you and Linus indicated that 250 should be the default.
>
> Oh, OK. hrm.
>
> Martin, it would be useful if you could determine whether the kernbench
> slowdown was due to the 1000Hz->250Hz change, thanks.
>
> I'm assuming it was the CPU scheduler patches. There are 36 of them ;)

I'm looking at some issues with the scheduler patches.

To start with, it looks like the smp-nice patches are broken. Even if
they weren't I think it might be a good idea just to put them on hold
until we work out what to do with the other sched patches... we're
only just starting to get some interesting tests (ie. regressions)
being run on -mm (at least that I've been made aware of). So give me
a bit of time to work through that.

Anyway, Con, this is what it is doing on a 64-way Altix running aim7:
(compare imbalances, task move rates, wakeup move rates, etc).

--- wakeup statistics ---
 269.174 task wakes / s
  31.704% of them from the local CPU
  14.190% of remote wakeups come from domain0
    0.000% are moved to the local CPU via passive load balancing
    26.660% are moved to the local CPU via affine wakeups
  46.672% of remote wakeups come from domain1
    10.359% are moved to the local CPU via passive load balancing
    0.000% are moved to the local CPU via affine wakeups
  39.012% of remote wakeups come from domain2
    10.659% are moved to the local CPU via passive load balancing
    0.000% are moved to the local CPU via affine wakeups
--- load balancing statistics ---
for domain0
 4368.652 load balance calls / s move 137.083 tasks / s
  96.456% calls and 1.174% task moves came from idle balancing
    0.042% were imbalanced with an average imbalance of 566.708
    0.038% found an imbalance but failed
    6.165% of tasks moved were cache hot
  1.818% calls and 73.086% task moves came from busy balancing
    47.694% were imbalanced with an average imbalance of 335.932
    4.704% found an imbalance but failed
    0.140% of tasks moved were cache hot
  1.726% calls and 25.740% task moves came from new-idle balancing
    26.938% were imbalanced with an average imbalance of 198.054
    9.136% found an imbalance but failed
    0.151% of tasks moved were cache hot
 0.000 active balances / s move 0.000 tasks / s
 0.000 exec balances / s move 0.000 tasks / s
 0.000 fork balances / s move 0.000 tasks / s
for domain1
 102.002 load balance calls / s move 180.344 tasks / s
  85.398% calls and 17.496% task moves came from idle balancing
    5.920% were imbalanced with an average imbalance of 386.172
    2.103% found an imbalance but failed
    0.920% of tasks moved were cache hot
  14.602% calls and 82.504% task moves came from busy balancing
    69.017% were imbalanced with an average imbalance of 702.928
    5.849% found an imbalance but failed
    0.075% of tasks moved were cache hot
  0.000% calls and 0.000% task moves came from new-idle balancing
 0.048 active balances / s move 0.002 tasks / s
  %95.000 attempts failed
 0.000 exec balances / s move 0.000 tasks / s
 0.000 fork balances / s move 0.000 tasks / s
for domain2
 9.496 load balance calls / s move 13.070 tasks / s
  91.335% calls and 32.327% task moves came from idle balancing
    21.094% were imbalanced with an average imbalance of 115.513
    16.936% found an imbalance but failed
    2.978% of tasks moved were cache hot
  8.665% calls and 67.673% task moves came from busy balancing
    64.118% were imbalanced with an average imbalance of 503.867
    17.353% found an imbalance but failed
    0.383% of tasks moved were cache hot
  0.000% calls and 0.000% task moves came from new-idle balancing
 0.007 active balances / s move 0.007 tasks / s
  %0.000 attempts failed
 0.000 exec balances / s move 0.000 tasks / s
 0.000 fork balances / s move 0.000 tasks / s

And this is what it looks like with smpnice #if'ed out:

--- wakeup statistics ---
 331.734 task wakes / s
  25.492% of them from the local CPU
  13.601% of remote wakeups come from domain0
    0.000% are moved to the local CPU via passive load balancing
    1.674% are moved to the local CPU via affine wakeups
  44.484% of remote wakeups come from domain1
    3.139% are moved to the local CPU via passive load balancing
    0.000% are moved to the local CPU via affine wakeups
  42.088% of remote wakeups come from domain2
    0.000% are moved to the local CPU via passive load balancing
    0.000% are moved to the local CPU via affine wakeups
--- load balancing statistics ---
for domain0
 3940.070 load balance calls / s move 3.671 tasks / s
  96.488% calls and 48.889% task moves came from idle balancing
    0.068% were imbalanced with an average imbalance of 1.132
    0.029% found an imbalance but failed
    3.135% of tasks moved were cache hot
  1.339% calls and 33.563% task moves came from busy balancing
    2.319% were imbalanced with an average imbalance of 1.037
    0.069% found an imbalance but failed
    0.228% of tasks moved were cache hot
  2.173% calls and 17.548% task moves came from new-idle balancing
    1.259% were imbalanced with an average imbalance of 1.008
    0.516% found an imbalance but failed
    3.057% of tasks moved were cache hot
 0.006 active balances / s move 0.006 tasks / s
  %0.000 attempts failed
 0.000 exec balances / s move 0.000 tasks / s
 0.000 fork balances / s move 0.000 tasks / s
for domain1
 86.378 load balance calls / s move 2.644 tasks / s
  94.236% calls and 89.468% task moves came from idle balancing
    4.116% were imbalanced with an average imbalance of 1.123
    1.597% found an imbalance but failed
    4.281% of tasks moved were cache hot
  5.764% calls and 10.532% task moves came from busy balancing
    6.667% were imbalanced with an average imbalance of 1.008
    1.130% found an imbalance but failed
    0.000% of tasks moved were cache hot
  0.000% calls and 0.000% task moves came from new-idle balancing
 0.082 active balances / s move 0.017 tasks / s
  %79.310 attempts failed
 0.000 exec balances / s move 0.000 tasks / s
 0.000 fork balances / s move 0.000 tasks / s
for domain2
 9.024 load balance calls / s move 0.343 tasks / s
  95.293% calls and 88.525% task moves came from idle balancing
    12.103% were imbalanced with an average imbalance of 1.003
    8.701% found an imbalance but failed
    14.815% of tasks moved were cache hot
  4.707% calls and 11.475% task moves came from busy balancing
    16.556% were imbalanced with an average imbalance of 1.000
    7.285% found an imbalance but failed
    21.429% of tasks moved were cache hot
  0.000% calls and 0.000% task moves came from new-idle balancing
 0.008 active balances / s move 0.008 tasks / s
  %0.000 attempts failed
 0.000 exec balances / s move 0.000 tasks / s
 0.000 fork balances / s move 0.000 tasks / s

-- 
SUSE Labs, Novell Inc.

* Re: 2.6.12-rc6-mm1
From: Con Kolivas @ 2005-06-08 3:33 UTC
To: Nick Piggin; +Cc: Andrew Morton, Christoph Lameter, mbligh, lkml

On Wed, 8 Jun 2005 01:17 pm, Nick Piggin wrote:
> On Tue, 2005-06-07 at 17:08 -0700, Andrew Morton wrote:
> > Christoph Lameter <clameter@engr.sgi.com> wrote:
> > > On Tue, 7 Jun 2005, Andrew Morton wrote:
> > > > > Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).
> > > >
> > > > Oh crap, so it does. That's wrong.
> > >
> > > Email by you and Linus indicated that 250 should be the default.
> >
> > Oh, OK. hrm.
> >
> > Martin, it would be useful if you could determine whether the kernbench
> > slowdown was due to the 1000Hz->250Hz change, thanks.
> >
> > I'm assuming it was the CPU scheduler patches. There are 36 of them ;)
>
> I'm looking at some issues with the scheduler patches.
>
> To start with, it looks like the smp-nice patches are broken. Even if
> they weren't I think it might be a good idea just to put them on hold
> until we work out what to do with the other sched patches...

I originally said I'd wait till the sched patches settled down before tackling
it but it didn't look like that was ever going to happen and broken nice on
SMP is a real bug biting people now so I figured I should just tackle it
anyway. I don't mind if we just work on it later though.

> Anyway, Con, this is what it is doing on a 64-way Altix running aim7:
> (compare imbalances, task move rates, wakeup move rates, etc).

Definitely different I agree. As for the performance impact the statistics
alone don't tell us if they're for good or evil, but we can look at it again
separately when we tackle smp nice again. It is a real issue for users now,
though so it would be good if we can have a calmer period in the future to do
this (smp nice) by itself.

These are the four patches Andrew:
sched-implement-nice-support-across-physical-cpus-on-smp.patch
sched-change_prio_bias_only_if_queued.patch
sched-account_rt_tasks_in_prio_bias.patch
sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch

The other HT patch by me is separate and a bugfix so please leave that in.

Cheers,
Con

* Re: 2.6.12-rc6-mm1
From: Nick Piggin @ 2005-06-08 3:50 UTC
To: Con Kolivas; +Cc: Andrew Morton, Christoph Lameter, mbligh, lkml

On Wed, 2005-06-08 at 13:33 +1000, Con Kolivas wrote:
> On Wed, 8 Jun 2005 01:17 pm, Nick Piggin wrote:

> > To start with, it looks like the smp-nice patches are broken. Even if
> > they weren't I think it might be a good idea just to put them on hold
> > until we work out what to do with the other sched patches...
>
> I originally said I'd wait till the sched patches settled down before tackling
> it but it didn't look like that was ever going to happen and broken nice on
> SMP is a real bug biting people now so I figured I should just tackle it
> anyway. I don't mind if we just work on it later though.
>

Well I agree with you that it would be nice to fix it. I think your
approach has good potential, and it is along the same lines as what
I had in mind.

> > Anyway, Con, this is what it is doing on a 64-way Altix running aim7:
> > (compare imbalances, task move rates, wakeup move rates, etc).
>
> Definitely different I agree. As for the performance impact the statistics
> alone don't tell us if they're for good or evil, but we can look at it again
> separately when we tackle smp nice again. It is a real issue for users now,
> though so it would be good if we can have a calmer period in the future to do
> this (smp nice) by itself.
>

True. Fortunately this seems to only come up once a year or so.
Although I guess with the rise and rise of multi threaded and multi
cored CPUs it could become a bigger issue.

> These are the four patches Andrew:
> sched-implement-nice-support-across-physical-cpus-on-smp.patch
> sched-change_prio_bias_only_if_queued.patch
> sched-account_rt_tasks_in_prio_bias.patch
> sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch
>

Thanks.

> The other HT patch by me is separate and a bugfix so please leave that in.
>

Yep.

Nick

-- 
SUSE Labs, Novell Inc.

* Re: 2.6.12-rc6-mm1
From: Martin J. Bligh @ 2005-06-08 14:15 UTC
To: Andrew Morton, Christoph Lameter; +Cc: linux-kernel

--Andrew Morton <akpm@osdl.org> wrote (on Tuesday, June 07, 2005 17:08:53 -0700):

> Christoph Lameter <clameter@engr.sgi.com> wrote:
>>
>> On Tue, 7 Jun 2005, Andrew Morton wrote:
>>
>> > > Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).
>> >
>> > Oh crap, so it does. That's wrong.
>>
>> Email by you and Linus indicated that 250 should be the default.
>
> Oh, OK. hrm.
>
> Martin, it would be useful if you could determine whether the kernbench
> slowdown was due to the 1000Hz->250Hz change, thanks.
>
> I'm assuming it was the CPU scheduler patches. There are 36 of them ;)

Is actually worse with HZ=1000 ... so I think we still have another
problem, probably with scheduler patches.

(the one marked -mm1+p4947 in blue is the patched one)
http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png

I can back out various patches ... are all the scheduler patches starting
in sched.* or something equally obvious? if not, a list of what to blat
would help me ... or I'll do a crapshoot, and see what falls out ;-)

M.

* Re: 2.6.12-rc6-mm1
From: Martin J. Bligh @ 2005-06-09 23:56 UTC
To: Andrew Morton, Christoph Lameter; +Cc: linux-kernel

--On Tuesday, June 07, 2005 17:08:53 -0700 Andrew Morton <akpm@osdl.org> wrote:

> Christoph Lameter <clameter@engr.sgi.com> wrote:
>>
>> On Tue, 7 Jun 2005, Andrew Morton wrote:
>>
>> > > Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).
>> >
>> > Oh crap, so it does. That's wrong.
>>
>> Email by you and Linus indicated that 250 should be the default.
>
> Oh, OK. hrm.
>
> Martin, it would be useful if you could determine whether the kernbench
> slowdown was due to the 1000Hz->250Hz change, thanks.
>
> I'm assuming it was the CPU scheduler patches. There are 36 of them ;)

Backed them all out ... performance thunks down to earth again, and is
actually the best I've seen it ever (probably 250Hz is helping, I used
to run 100 in -mjb for better benefit).

the +5081 item is the one to look at
http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png

Patch I used was here:

http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/patches/nosched

But it was just everything under the "CPU scheduler" section of your
series file.

M.

* Re: 2.6.12-rc6-mm1
From: Ingo Molnar @ 2005-06-10 7:02 UTC
To: Martin J. Bligh; +Cc: Andrew Morton, Christoph Lameter, linux-kernel

* Martin J. Bligh <mbligh@mbligh.org> wrote:

> > I'm assuming it was the CPU scheduler patches. There are 36 of them ;)
>
> Backed them all out ... performance thunks down to earth again, and is
> actually the best I've seen it ever (probably 250Hz is helping, I used
> to run 100 in -mjb for better benefit).
>
> the +5081 item is the one to look at
> http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
>
> Patch I used was here:
>
> http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/patches/nosched
>
> But it was just everything under the "CPU scheduler" section of your
> series file.

we know from Nick's testing that the patches up to and including
dynamic-sched-domains-ia64-changes.patch are probably OK. So the
candidates for the regression are:

 sched-implement-nice-support-across-physical-cpus-on-smp.patch
 sched-change_prio_bias_only_if_queued.patch
 sched-account_rt_tasks_in_prio_bias.patch
 consolidate-preempt-options-into-kernel-kconfigpreempt.patch
 enable-preempt_bkl-on-preemptsmp-too.patch
 sched-tweak-idle-thread-setup-semantics.patch
 sched-voluntary-kernel-preemption.patch
 sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch
 sched-task_noninteractive.patch
 sched-run-sched_normal-tasks-with-real-time-tasks-on-smt-siblings.patch

there are two feature patches in this:

 enable-preempt_bkl-on-preemptsmp-too.patch
 sched-voluntary-kernel-preemption.patch

so make sure you have PREEMPT_BKL and PREEMPT_VOLUNTARY disabled.

these ones should not impact your workload's functionality (unless they
are buggy):

 sched-account_rt_tasks_in_prio_bias.patch
 consolidate-preempt-options-into-kernel-kconfigpreempt.patch
 sched-tweak-idle-thread-setup-semantics.patch
 sched-run-sched_normal-tasks-with-real-time-tasks-on-smt-siblings.patch

and unless you are using separate nice levels, this one shouldn't make a
difference in theory:

 sched-implement-nice-support-across-physical-cpus-on-smp.patch

which leaves the following 3 likely candidates:

 sched-change_prio_bias_only_if_queued.patch
 sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch
 sched-task_noninteractive.patch

so if you could do a run with all 3 of the above unapplied, that would
be a good starting point. (But any of the others might be it too, if
they contain some sort of bug.)

	Ingo

* Re: 2.6.12-rc6-mm1
From: Con Kolivas @ 2005-06-10 12:03 UTC
To: linux-kernel; +Cc: Ingo Molnar, Martin J. Bligh, Andrew Morton, Christoph Lameter

On Fri, 10 Jun 2005 17:02, Ingo Molnar wrote:
> * Martin J. Bligh <mbligh@mbligh.org> wrote:
> > > I'm assuming it was the CPU scheduler patches. There are 36 of them ;)
> >
> > Backed them all out ... performance thunks down to earth again, and is
> > actually the best I've seen it ever (probably 250Hz is helping, I used
> > to run 100 in -mjb for better benefit).
> >
> > the +5081 item is the one to look at
> > http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
> >
> > Patch I used was here:
> >
> > http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/patches/nosched
> >
> > But it was just everything under the "CPU scheduler" section of your
> > series file.
>
> we know from Nick's testing that the patches up to and including
> dynamic-sched-domains-ia64-changes.patch are probably OK. So the
> candidates for the regression are:
>
>  sched-implement-nice-support-across-physical-cpus-on-smp.patch
>  sched-change_prio_bias_only_if_queued.patch
>  sched-account_rt_tasks_in_prio_bias.patch
>  consolidate-preempt-options-into-kernel-kconfigpreempt.patch
>  enable-preempt_bkl-on-preemptsmp-too.patch
>  sched-tweak-idle-thread-setup-semantics.patch
>  sched-voluntary-kernel-preemption.patch
>  sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch
>  sched-task_noninteractive.patch
>  sched-run-sched_normal-tasks-with-real-time-tasks-on-smt-siblings.patch
>
> there are two feature patches in this:
>
>  enable-preempt_bkl-on-preemptsmp-too.patch
>  sched-voluntary-kernel-preemption.patch
>
> so make sure you have PREEMPT_BKL and PREEMPT_VOLUNTARY disabled.
>
> these ones should not impact your workload's functionality (unless they
> are buggy):
>
>  sched-account_rt_tasks_in_prio_bias.patch
>  consolidate-preempt-options-into-kernel-kconfigpreempt.patch
>  sched-tweak-idle-thread-setup-semantics.patch
>  sched-run-sched_normal-tasks-with-real-time-tasks-on-smt-siblings.patch
>
> and unless you are using separate nice levels, this one shouldn't make a
> difference in theory:
>
>  sched-implement-nice-support-across-physical-cpus-on-smp.patch
>
> which leaves the following 3 likely candidates:
>
>  sched-change_prio_bias_only_if_queued.patch
>  sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch

These tend to run together so just try adding my four patches together. In
retrospect I guess they're likely candidates because they also change the
_ratio_ of balance which they should not so they are buggy as a group
currently. Easy enough to fix but it will make it easy to pinpoint the
problem if they're responsible.

sched-implement-nice-support-across-physical-cpus-on-smp.patch
sched-change_prio_bias_only_if_queued.patch
sched-account_rt_tasks_in_prio_bias.patch
sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch

Con

* Re: 2.6.12-rc6-mm1
From: Con Kolivas @ 2005-06-10 14:19 UTC
To: linux-kernel; +Cc: Ingo Molnar, Martin J. Bligh, Andrew Morton, Christoph Lameter, Nick Piggin

On Fri, 10 Jun 2005 22:03, Con Kolivas wrote:
> On Fri, 10 Jun 2005 17:02, Ingo Molnar wrote:
> > * Martin J. Bligh <mbligh@mbligh.org> wrote:
> > > > I'm assuming it was the CPU scheduler patches. There are 36 of them
> > > > ;)
> >
> > So the
> > candidates for the regression are:
> >
> >  sched-implement-nice-support-across-physical-cpus-on-smp.patch
> >  sched-change_prio_bias_only_if_queued.patch
> >  sched-account_rt_tasks_in_prio_bias.patch
> >  consolidate-preempt-options-into-kernel-kconfigpreempt.patch
> >  enable-preempt_bkl-on-preemptsmp-too.patch
> >  sched-tweak-idle-thread-setup-semantics.patch
> >  sched-voluntary-kernel-preemption.patch
> >  sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch
> >  sched-task_noninteractive.patch
> >  sched-run-sched_normal-tasks-with-real-time-tasks-on-smt-siblings.patch
>
> These tend to run together so just try adding my four patches together. In
> retrospect I guess they're likely candidates because they also change the
> _ratio_ of balance which they should not so they are buggy as a group
> currently. Easy enough to fix but it will make it easy to pinpoint the
> problem if they're responsible.
>
> sched-implement-nice-support-across-physical-cpus-on-smp.patch
> sched-change_prio_bias_only_if_queued.patch
> sched-account_rt_tasks_in_prio_bias.patch
> sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch

By the way it has already been decided to remove these patches from -mm
pending the completion of current scheduler work. If they turn out to be
responsible for this regression I apologise profusely :-|.

It is clearer to me now that I have made a mistake with the priority biasing,
and the following patch corrects it to the planned behaviour. This is academic
at this stage as we won't be looking at this particular feature again in
earnest until the other 32 scheduler patches (and any followups) go upstream.
It's already known that schedstats data will be off without further code to
understand smp nice as well (thanks Nick for pointing out the data)... more
academic stuff but obviously something to consider when/if we get there.

Cheers,
Con

[-- Attachment #1.2: sched-correct_smp_nice_bias.patch --]
[-- Type: text/x-diff, Size: 1621 bytes --]

The priority biasing was off by multiplying the total load by the total
priority bias and this ruins the ratio of loads between runqueues. This
patch should correct the ratios of loads between runqueues to be proportional
to overall load.

Signed-off-by: Con Kolivas <kernel@kolivas.org>

Index: linux-2.6.12-rc6-mm1/kernel/sched.c
===================================================================
--- linux-2.6.12-rc6-mm1.orig/kernel/sched.c	2005-06-10 23:56:56.000000000 +1000
+++ linux-2.6.12-rc6-mm1/kernel/sched.c	2005-06-10 23:59:57.000000000 +1000
@@ -978,7 +978,7 @@ static inline unsigned long __source_loa
 	else
 		source_load = min(cpu_load, load_now);
 
-	if (idle == NOT_IDLE || rq->nr_running > 1)
+	if (idle == NOT_IDLE || rq->nr_running > 1) {
 		/*
 		 * If we are busy rebalancing the load is biased by
 		 * priority to create 'nice' support across cpus. When
@@ -987,7 +987,10 @@ static inline unsigned long __source_loa
 		 * prevent idle rebalance from trying to pull tasks from a
 		 * queue with only one running task.
 		 */
-		source_load *= rq->prio_bias;
+		unsigned long prio_bias = rq->prio_bias / rq->nr_running;
+
+		source_load *= prio_bias;
+	}
 
 	return source_load;
 }
@@ -1011,8 +1014,11 @@ static inline unsigned long __target_loa
 	else
 		target_load = max(cpu_load, load_now);
 
-	if (idle == NOT_IDLE || rq->nr_running > 1)
-		target_load *= rq->prio_bias;
+	if (idle == NOT_IDLE || rq->nr_running > 1) {
+		unsigned long prio_bias = rq->prio_bias / rq->nr_running;
+
+		target_load *= prio_bias;
+	}
 
 	return target_load;
 }

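
A rough sketch of the ratio problem the patch description refers to, assuming rq->prio_bias is the sum of a per-task bias over the running tasks on a runqueue (an assumption; the thread does not spell out how it is accumulated). The function names, the per-task bias of 20 and the load figures are hypothetical, and this is a userspace illustration rather than kernel code.

```c
#include <assert.h>

/* Pre-patch form: the load is scaled by the runqueue's total bias. */
unsigned long load_total_bias(unsigned long load, unsigned long prio_bias)
{
	return load * prio_bias;
}

/*
 * Form used by the patch above: the load is scaled by the average bias
 * per running task. The caller must guarantee nr_running > 0.
 */
unsigned long load_avg_bias(unsigned long load, unsigned long prio_bias,
			    unsigned long nr_running)
{
	assert(nr_running > 0);
	return load * (prio_bias / nr_running);
}
```

Take queue A with one task (load 128, total bias 20) and queue B with two equal tasks (load 256, total bias 40). Scaling by the total bias gives 2560 against 10240, a 4:1 ratio for what is really a 2:1 load difference. Scaling by the average bias gives 2560 against 5120, which keeps the 2:1 ratio.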
* Re: 2.6.12-rc6-mm1
From: J.A. Magallon @ 2005-06-10 23:14 UTC
To: Con Kolivas; +Cc: linux-kernel, Ingo Molnar, Martin J. Bligh, Andrew Morton, Christoph Lameter, Nick Piggin

On 06.10, Con Kolivas wrote:
> The priority biasing was off by multiplying the total load by the total
> priority bias and this ruins the ratio of loads between runqueues. This
> patch should correct the ratios of loads between runqueues to be proportional
> to overall load.
>

2.6.12-rc6-mm1 + this patch just oopses nicely on boot.
I did not have a digital camera handy, but the first oops that fit in the
screen was this call chain:

kernel_thread_helper
init
init
do:base_setup
usermodehelper_init
__create_workqueue
EIP in try_to_wake_up

After this, there was another with some do_div_error calls...

Something looks un-initialized the first time, or the integer arithmetic
is wrong. I really don't like a*(b/c), I really prefer (a*b)/c. It is more
common b/c == 0 (because b<c), than the possibility of overflowing (a*b).

So I tried both. With this, it boots again:

--- linux-2.6.11-jam24/kernel/sched.c.orig	2005-06-11 00:59:44.000000000 +0200
+++ linux-2.6.11-jam24/kernel/sched.c	2005-06-11 01:03:32.000000000 +0200
@@ -987,9 +987,10 @@
 		 * prevent idle rebalance from trying to pull tasks from a
 		 * queue with only one running task.
 		 */
-		unsigned long prio_bias = rq->prio_bias / rq->nr_running;
+		unsigned long prio_scale = (rq->nr_running > 0 ?
+						rq->nr_running : 1);
 
-		source_load *= prio_bias;
+		source_load = (source_load*rq->prio_bias) / prio_scale;
 	}
 
 	return source_load;
@@ -1015,9 +1016,10 @@
 		target_load = max(cpu_load, load_now);
 
 	if (idle == NOT_IDLE || rq->nr_running > 1) {
-		unsigned long prio_bias = rq->prio_bias / rq->nr_running;
+		unsigned long prio_scale = (rq->nr_running > 0 ?
+						rq->nr_running : 1);
 
-		target_load *= prio_bias;
+		target_load = (target_load*rq->prio_bias) / prio_scale;
 	}
 
 	return target_load;

Perhaps this:

	if (idle == NOT_IDLE || rq->nr_running > 1)

should be

	if (idle == NOT_IDLE && rq->nr_running > 1)

???

Hope this helps, thanks.

> Signed-off-by: Con Kolivas <kernel@kolivas.org>
>
> Index: linux-2.6.12-rc6-mm1/kernel/sched.c
> ===================================================================
> --- linux-2.6.12-rc6-mm1.orig/kernel/sched.c	2005-06-10 23:56:56.000000000 +1000
> +++ linux-2.6.12-rc6-mm1/kernel/sched.c	2005-06-10 23:59:57.000000000 +1000
> @@ -978,7 +978,7 @@ static inline unsigned long __source_loa
>  	else
>  		source_load = min(cpu_load, load_now);
>
> -	if (idle == NOT_IDLE || rq->nr_running > 1)
> +	if (idle == NOT_IDLE || rq->nr_running > 1) {
>  		/*
>  		 * If we are busy rebalancing the load is biased by
>  		 * priority to create 'nice' support across cpus. When
> @@ -987,7 +987,10 @@ static inline unsigned long __source_loa
>  		 * prevent idle rebalance from trying to pull tasks from a
>  		 * queue with only one running task.
>  		 */
> -		source_load *= rq->prio_bias;
> +		unsigned long prio_bias = rq->prio_bias / rq->nr_running;
> +
> +		source_load *= prio_bias;
> +	}
>
>  	return source_load;
>  }
> @@ -1011,8 +1014,11 @@ static inline unsigned long __target_loa
>  	else
>  		target_load = max(cpu_load, load_now);
>
> -	if (idle == NOT_IDLE || rq->nr_running > 1)
> -		target_load *= rq->prio_bias;
> +	if (idle == NOT_IDLE || rq->nr_running > 1) {
> +		unsigned long prio_bias = rq->prio_bias / rq->nr_running;
> +
> +		target_load *= prio_bias;
> +	}
>
>  	return target_load;
>  }
>

-- 
J.A. Magallon <jamagallon()able!es>     \               Software is like sex:
werewolf!able!es                         \         It's better when it's free
Mandriva Linux release 2006.0 (Cooker) for i586
Linux 2.6.11-jam24 (gcc 4.0.0 (4.0.0-3mdk for Mandriva Linux release 2006.0))

* Re: 2.6.12-rc6-mm1
2005-06-10 23:14 ` 2.6.12-rc6-mm1 J.A. Magallon
@ 2005-06-10 23:59 ` Con Kolivas
2005-06-11 0:18 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-11 0:32 ` 2.6.12-rc6-mm1 J.A. Magallon
0 siblings, 2 replies; 72+ messages in thread
From: Con Kolivas @ 2005-06-10 23:59 UTC (permalink / raw)
To: J.A. Magallon
Cc: linux-kernel, Ingo Molnar, Martin J. Bligh, Andrew Morton, Christoph Lameter, Nick Piggin

[-- Attachment #1.1: Type: text/plain, Size: 1816 bytes --]

On Sat, 11 Jun 2005 09:14, J.A. Magallon wrote:
> On 06.10, Con Kolivas wrote:
> > The priority biasing was off by multiplying the total load by the total
> > priority bias and this ruins the ratio of loads between runqueues. This
> > patch should correct the ratios of loads between runqueues to be
> > proportional to overall load.
>
> 2.6.12-rc6-mm1 + this patch just oopses nicely on boot.
> I did not have a digital camera handy, but the first oops that fit on the
> screen was this call chain:
>
> kernel_thread_helper
> init
> init
> do_basic_setup
> usermodehelper_init
> __create_workqueue
> EIP in try_to_wake_up
>
> After this, there was another with some do_div_error calls...
>
> Something looks uninitialized the first time, or the integer arithmetic
> is wrong. I really don't like a*(b/c), I really prefer (a*b)/c. It is more
> common for b/c to be 0 (because b<c) than for (a*b) to overflow.
>
> So I tried both. With this, it boots again:

Doh Doh DOH DOH! I need a real swift kick up the bum. The point of the patch
was to show what was wrong with the math, and I shouldn't have posted it
without actually trying it.

> -		unsigned long prio_bias = rq->prio_bias / rq->nr_running;

rq->nr_running can often be 0 and rq->prio_bias by definition has to be larger
than or equal to rq->nr_running.

> Perhaps this:
>
> 	if (idle == NOT_IDLE || rq->nr_running > 1)
>
> should be
>
> 	if (idle == NOT_IDLE && rq->nr_running > 1)

No, testing for rq->nr_running > 1 is only needed if we are balancing in an
idle balance.

> Hope this helps, thanks.

Yes it does :\

Here is what the patch _should_ have been. (*same warnings with this patch
about math demonstration and untested as should have been posted with the
earlier one*)

Con

[-- Attachment #1.2: sched-correct_smp_nice_bias.patch --]
[-- Type: text/x-diff, Size: 1720 bytes --]

The priority biasing was off by multiplying the total load by the total
priority bias and this ruins the ratio of loads between runqueues. This
patch should correct the ratios of loads between runqueues to be proportional
to overall load. -2nd attempt.

Signed-off-by: Con Kolivas <kernel@kolivas.org>

Index: linux-2.6.12-rc6-mm1/kernel/sched.c
===================================================================
--- linux-2.6.12-rc6-mm1.orig/kernel/sched.c	2005-06-10 23:56:56.000000000 +1000
+++ linux-2.6.12-rc6-mm1/kernel/sched.c	2005-06-11 09:55:56.000000000 +1000
@@ -978,7 +978,8 @@ static inline unsigned long __source_loa
 	else
 		source_load = min(cpu_load, load_now);
 
-	if (idle == NOT_IDLE || rq->nr_running > 1)
+	if (idle == NOT_IDLE || rq->nr_running > 1) {
+		unsigned long prio_bias = 1;
 		/*
 		 * If we are busy rebalancing the load is biased by
 		 * priority to create 'nice' support across cpus. When
@@ -987,7 +988,10 @@ static inline unsigned long __source_loa
 		 * prevent idle rebalance from trying to pull tasks from a
 		 * queue with only one running task.
 		 */
-		source_load *= rq->prio_bias;
+		if (rq->nr_running)
+			prio_bias = rq->prio_bias / rq->nr_running;
+		source_load *= prio_bias;
+	}
 
 	return source_load;
 }
@@ -1011,8 +1015,13 @@ static inline unsigned long __target_loa
 	else
 		target_load = max(cpu_load, load_now);
 
-	if (idle == NOT_IDLE || rq->nr_running > 1)
-		target_load *= rq->prio_bias;
+	if (idle == NOT_IDLE || rq->nr_running > 1) {
+		unsigned long prio_bias = 1;
+
+		if (rq->nr_running)
+			prio_bias = rq->prio_bias / rq->nr_running;
+		target_load *= prio_bias;
+	}
 
 	return target_load;
 }

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-10 23:59 ` 2.6.12-rc6-mm1 Con Kolivas
@ 2005-06-11 0:18 ` Con Kolivas
2005-06-11 0:32 ` 2.6.12-rc6-mm1 J.A. Magallon
1 sibling, 0 replies; 72+ messages in thread
From: Con Kolivas @ 2005-06-11 0:18 UTC (permalink / raw)
To: linux-kernel
Cc: J.A. Magallon, Ingo Molnar, Martin J. Bligh, Andrew Morton, Christoph Lameter, Nick Piggin

[-- Attachment #1: Type: text/plain, Size: 447 bytes --]

On Sat, 11 Jun 2005 09:59, Con Kolivas wrote:
> Here is what the patch _should_ have been. (*same warnings with this patch
> about math demonstration and untested as should have been posted with the
> earlier one*)

Ok I booted this patch and all seems fine.

Thanks to those that tracked down this regression and the bugs, and apologies
for the inconvenience. Looks like Martin's automated testbed is already
paying off.

Cheers,
Con

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-10 23:59 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-11 0:18 ` 2.6.12-rc6-mm1 Con Kolivas
@ 2005-06-11 0:32 ` J.A. Magallon
2005-06-11 0:48 ` 2.6.12-rc6-mm1 Con Kolivas
1 sibling, 1 reply; 72+ messages in thread
From: J.A. Magallon @ 2005-06-11 0:32 UTC (permalink / raw)
To: Con Kolivas
Cc: linux-kernel, Ingo Molnar, Martin J. Bligh, Andrew Morton, Christoph Lameter, Nick Piggin

On 06.11, Con Kolivas wrote:
>
> Here is what the patch _should_ have been. (*same warnings with this patch
> about math demonstration and untested as should have been posted with the
> earlier one*)
>
> +	if (idle == NOT_IDLE || rq->nr_running > 1) {
> +		unsigned long prio_bias = 1;
> +		if (rq->nr_running)
> +			prio_bias = rq->prio_bias / rq->nr_running;
> +		source_load *= prio_bias;
> +	}
>

Again... sorry, I'm not trying to be picky, I just want to know whether it's
worth it or not...

Wouldn't something like this be better:

	if (idle == NOT_IDLE || rq->nr_running > 1) {
		if (rq->nr_running)
			source_load = (source_load*rq->prio_bias) / rq->nr_running;
	}

wrt the integer math? Think of

100*( 5/5) vs  500/5
100*( 6/5) vs  600/5
100*( 7/5) vs  700/5
100*( 8/5) vs  800/5
100*( 9/5) vs  900/5
100*(10/5) vs 1000/5

--
J.A. Magallon <jamagallon()able!es> \ Software is like sex:
werewolf!able!es \ It's better when it's free
Mandriva Linux release 2006.0 (Cooker) for i586
Linux 2.6.11-jam24 (gcc 4.0.0 (4.0.0-3mdk for Mandriva Linux release 2006.0))
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-11 0:32 ` 2.6.12-rc6-mm1 J.A. Magallon
@ 2005-06-11 0:48 ` Con Kolivas
2005-06-11 0:52 ` 2.6.12-rc6-mm1 Con Kolivas
0 siblings, 1 reply; 72+ messages in thread
From: Con Kolivas @ 2005-06-11 0:48 UTC (permalink / raw)
To: J.A. Magallon
Cc: linux-kernel, Ingo Molnar, Martin J. Bligh, Andrew Morton, Christoph Lameter, Nick Piggin

[-- Attachment #1: Type: text/plain, Size: 931 bytes --]

On Sat, 11 Jun 2005 10:32, J.A. Magallon wrote:
> On 06.11, Con Kolivas wrote:
> > Here is what the patch _should_ have been. (*same warnings with this
> > patch about math demonstration and untested as should have been posted
> > with the earlier one*)
> >
> > +	if (idle == NOT_IDLE || rq->nr_running > 1) {
> > +		unsigned long prio_bias = 1;
> > +		if (rq->nr_running)
> > +			prio_bias = rq->prio_bias / rq->nr_running;
> > +		source_load *= prio_bias;
> > +	}
>
> Again... sorry, I'm not trying to be picky, I just want to know whether it's
> worth it or not...
>
> Wouldn't something like this be better:
>
> 	if (idle == NOT_IDLE || rq->nr_running > 1) {
> 		if (rq->nr_running)
> 			source_load = (source_load*rq->prio_bias) / rq->nr_running;
> 	}

I understand your concern, but by definition rq->nr_running will always be a
factor of rq->prio_bias so integer math should be fine. Either will do.

Cheers,
Con

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-11 0:48 ` 2.6.12-rc6-mm1 Con Kolivas
@ 2005-06-11 0:52 ` Con Kolivas
0 siblings, 0 replies; 72+ messages in thread
From: Con Kolivas @ 2005-06-11 0:52 UTC (permalink / raw)
To: J.A. Magallon
Cc: linux-kernel, Ingo Molnar, Martin J. Bligh, Andrew Morton, Christoph Lameter, Nick Piggin

On Sat, 11 Jun 2005 10:48, Con Kolivas wrote:
> On Sat, 11 Jun 2005 10:32, J.A. Magallon wrote:
> > On 06.11, Con Kolivas wrote:
> > > Here is what the patch _should_ have been. (*same warnings with this
> > > patch about math demonstration and untested as should have been posted
> > > with the earlier one*)
> > >
> > > +	if (idle == NOT_IDLE || rq->nr_running > 1) {
> > > +		unsigned long prio_bias = 1;
> > > +		if (rq->nr_running)
> > > +			prio_bias = rq->prio_bias / rq->nr_running;
> > > +		source_load *= prio_bias;
> > > +	}
> >
> > Again... sorry, I'm not trying to be picky, I just want to know whether
> > it's worth it or not...
> >
> > Wouldn't something like this be better:
> >
> > 	if (idle == NOT_IDLE || rq->nr_running > 1) {
> > 		if (rq->nr_running)
> > 			source_load = (source_load*rq->prio_bias) / rq->nr_running;
> > 	}
>
> I understand your concern, but by definition rq->nr_running will always be
> a factor of rq->prio_bias so integer math should be fine. Either will do.

Hmm. No you are right and I'm smoking crack, but integer math should still be
accurate enough here. Let me think about the accuracy before spraying more
patches like a fool.

Cheers,
Con
^ permalink raw reply [flat|nested] 72+ messages in thread
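[Editorial aside, not part of the thread] The a*(b/c) versus (a*b)/c point argued above can be sketched in plain C. This is a minimal userspace illustration, not kernel code: the helper names bias_divide_first() and bias_multiply_first() are hypothetical, and the numbers are the ones from J.A. Magallon's table. The asserts only guard against a zero divisor in this sketch.

```c
#include <assert.h>

/*
 * Hypothetical helper: divide first, as in a * (b / c).
 * The quotient prio_bias / nr_running is truncated before scaling,
 * so any remainder is lost.
 */
unsigned long bias_divide_first(unsigned long load, unsigned long prio_bias,
				unsigned long nr_running)
{
	assert(nr_running != 0);	/* caller must rule out an empty queue */
	return load * (prio_bias / nr_running);
}

/*
 * Hypothetical helper: multiply first, as in (a * b) / c.
 * Truncation happens once, after scaling, at the cost of a larger
 * intermediate value (which could overflow for very large inputs).
 */
unsigned long bias_multiply_first(unsigned long load, unsigned long prio_bias,
				  unsigned long nr_running)
{
	assert(nr_running != 0);	/* caller must rule out an empty queue */
	return (load * prio_bias) / nr_running;
}
```

With load 100 and nr_running 5, the divide-first form gives the same result for every prio_bias from 5 to 9, while the multiply-first form distinguishes them; the two agree only when nr_running divides prio_bias exactly.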
* Re: 2.6.12-rc6-mm1
2005-06-10 14:19 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-10 23:14 ` 2.6.12-rc6-mm1 J.A. Magallon
@ 2005-06-10 23:50 ` Martin J. Bligh
2005-06-11 4:14 ` 2.6.12-rc6-mm1 Martin J. Bligh
1 sibling, 1 reply; 72+ messages in thread
From: Martin J. Bligh @ 2005-06-10 23:50 UTC (permalink / raw)
To: Con Kolivas, linux-kernel
Cc: Ingo Molnar, Andrew Morton, Christoph Lameter, Nick Piggin

>> These tend to run together so just try adding my four patches together. In
>> retrospect I guess they're likely candidates because they also change the
>> _ratio_ of balance which they should not so they are buggy as a group
>> currently. Easy enough to fix but it will make it easy to pinpoint the
>> problem if they're responsible.
>>
>> sched-implement-nice-support-across-physical-cpus-on-smp.patch
>> sched-change_prio_bias_only_if_queued.patch
>> sched-account_rt_tasks_in_prio_bias.patch
>> sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch
>
> By the way it has already been decided to remove these patches from -mm
> pending the completion of current scheduler work. If they turn out to be
> responsible for this regression I apologise profusely :-|.
>
> It is clearer to me now that I have made a mistake with the priority biasing,
> and the following patch corrects it to the planned behaviour. This is
> academic at this stage as we won't be looking at this particular feature
> again in earnest until the other 32 scheduler patches (and any followups) go
> upstream.
>
> It's already known that schedstats data will be off without further code to
> understand smp nice as well (thanks Nick for pointing out the data)... more
> academic stuff but obviously something to consider when/if we get there.

OK, I backed out those 4, and the degradation mostly went away.
See

http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png

and more specifically, see the +p5150 near the right hand side.
I don't think it's quite as good as mainline, but much closer.
I did this run with HZ=1000, and the one with no scheduler
patches at all with HZ=250, so I'll try to do a run that's more
directly comparable as well.

Thanks,

M.
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-10 23:50 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-11 4:14 ` Martin J. Bligh
2005-06-11 5:22 ` 2.6.12-rc6-mm1 Con Kolivas
0 siblings, 1 reply; 72+ messages in thread
From: Martin J. Bligh @ 2005-06-11 4:14 UTC (permalink / raw)
To: Con Kolivas, linux-kernel
Cc: Ingo Molnar, Andrew Morton, Christoph Lameter, Nick Piggin

--"Martin J. Bligh" <mbligh@mbligh.org> wrote (on Friday, June 10, 2005 16:50:40 -0700):

>>> These tend to run together so just try adding my four patches together. In
>>> retrospect I guess they're likely candidates because they also change the
>>> _ratio_ of balance which they should not so they are buggy as a group
>>> currently. Easy enough to fix but it will make it easy to pinpoint the
>>> problem if they're responsible.
>>>
>>> sched-implement-nice-support-across-physical-cpus-on-smp.patch
>>> sched-change_prio_bias_only_if_queued.patch
>>> sched-account_rt_tasks_in_prio_bias.patch
>>> sched-smp-nice-bias-busy-queues-on-idle-rebalance.patch
>>
>> By the way it has already been decided to remove these patches from -mm
>> pending the completion of current scheduler work. If they turn out to be
>> responsible for this regression I apologise profusely :-|.
>>
>> It is clearer to me now that I have made a mistake with the priority biasing,
>> and the following patch corrects it to the planned behaviour. This is
>> academic at this stage as we won't be looking at this particular feature
>> again in earnest until the other 32 scheduler patches (and any followups) go
>> upstream.
>>
>> It's already known that schedstats data will be off without further code to
>> understand smp nice as well (thanks Nick for pointing out the data)... more
>> academic stuff but obviously something to consider when/if we get there.
>
> OK, I backed out those 4, and the degradation mostly went away.
> See
> http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
>
> and more specifically, see the +p5150 near the right hand side.
> I don't think it's quite as good as mainline, but much closer.
> I did this run with HZ=1000, and the one with no scheduler
> patches at all with HZ=250, so I'll try to do a run that's more
> directly comparable as well.

OK, that makes it look much more like mainline. Looks like you were still
revising the details of your patch Con ... once you're ready, drop me a
URL for it, and I'll make the system whack on that too.

M.

PS. Hmmm. I need to get better at identifying what +p5150 means in the
graphs, etc ;-( Maybe HTML explanation with embedded png image.
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-11 4:14 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-11 5:22 ` Con Kolivas
2005-06-11 5:56 ` 2.6.12-rc6-mm1 Martin J. Bligh
2005-06-11 20:13 ` 2.6.12-rc6-mm1 Martin J. Bligh
0 siblings, 2 replies; 72+ messages in thread
From: Con Kolivas @ 2005-06-11 5:22 UTC (permalink / raw)
To: Martin J. Bligh
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Christoph Lameter, Nick Piggin

[-- Attachment #1: Type: text/plain, Size: 1126 bytes --]

On Sat, 11 Jun 2005 14:14, Martin J. Bligh wrote:
> --"Martin J. Bligh" <mbligh@mbligh.org> wrote (on Friday, June 10, 2005
> > OK, I backed out those 4, and the degradation mostly went away.
> > See
> > http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
> >
> > and more specifically, see the +p5150 near the right hand side.
> > I don't think it's quite as good as mainline, but much closer.
> > I did this run with HZ=1000, and the one with no scheduler
> > patches at all with HZ=250, so I'll try to do a run that's more
> > directly comparable as well.
>
> OK, that makes it look much more like mainline. Looks like you were still
> revising the details of your patch Con ... once you're ready, drop me a
> URL for it, and I'll make the system whack on that too.

Great thanks. Here are all the reconsidered changes rolled up into one patch
that applies directly to 2.6.12-rc6-mm1 -purely for testing purposes-. I'd be
very grateful to see how this performs; it has been boot and stress tested at
this end. If it shows detriment I'll have to make the smp nice changes more
complex.

Cheers,
Con

[-- Attachment #2: 2.6.12-rc6-mm1-mjbtest.patch --]
[-- Type: text/x-diff, Size: 1253 bytes --]

Index: linux-2.6.12-rc6-mm1/kernel/sched.c
===================================================================
--- linux-2.6.12-rc6-mm1.orig/kernel/sched.c	2005-06-10 23:56:56.000000000 +1000
+++ linux-2.6.12-rc6-mm1/kernel/sched.c	2005-06-11 11:48:09.000000000 +1000
@@ -978,7 +978,7 @@ static inline unsigned long __source_loa
 	else
 		source_load = min(cpu_load, load_now);
 
-	if (idle == NOT_IDLE || rq->nr_running > 1)
+	if (rq->nr_running > 1 || (idle == NOT_IDLE && rq->nr_running))
 		/*
 		 * If we are busy rebalancing the load is biased by
 		 * priority to create 'nice' support across cpus. When
@@ -987,7 +987,7 @@ static inline unsigned long __source_loa
 		 * prevent idle rebalance from trying to pull tasks from a
 		 * queue with only one running task.
 		 */
-		source_load *= rq->prio_bias;
+		source_load = source_load * rq->prio_bias / rq->nr_running;
 
 	return source_load;
 }
@@ -1011,8 +1011,8 @@ static inline unsigned long __target_loa
 	else
 		target_load = max(cpu_load, load_now);
 
-	if (idle == NOT_IDLE || rq->nr_running > 1)
-		target_load *= rq->prio_bias;
+	if (rq->nr_running > 1 || (idle == NOT_IDLE && rq->nr_running))
+		target_load = target_load * rq->prio_bias / rq->nr_running;
 
 	return target_load;
 }
^ permalink raw reply [flat|nested] 72+ messages in thread
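[Editorial aside, not part of the thread] A rough userspace sketch of what the condition in the rolled-up patch above does, for readers who want to trace it by hand. The function biased_load() and its flat parameters are hypothetical stand-ins for the rq fields; the enum only mimics the idle states the patch tests against; the load and bias numbers are made up for illustration. The assert encodes the invariant Con states earlier in the thread (prio_bias is at least nr_running) and is an assumption of this sketch.

```c
#include <assert.h>

/* Stand-in for the scheduler's idle states; only NOT_IDLE matters here. */
enum idle_type { SCHED_IDLE, NOT_IDLE, NEWLY_IDLE };

/*
 * Hypothetical model of the biasing step: scale the load by the average
 * per-task priority bias (prio_bias / nr_running), multiplying first so
 * truncation happens only once.  The condition guarantees nr_running is
 * non-zero whenever the division is reached.
 */
unsigned long biased_load(unsigned long load, unsigned long prio_bias,
			  unsigned long nr_running, enum idle_type idle)
{
	/* Invariant quoted in the thread: prio_bias >= nr_running. */
	assert(prio_bias >= nr_running);

	if (nr_running > 1 || (idle == NOT_IDLE && nr_running))
		load = load * prio_bias / nr_running;

	return load;
}
```

Tracing a few cases: an empty queue is returned unbiased and never divides by zero; a single-task queue is biased during busy balancing but left alone during idle balancing; and two queues whose tasks all carry the same bias keep their original load ratio (2560 versus 5120 for loads 128 and 256 with a per-task bias of 20), where multiplying by the total bias would have given 2560 versus 10240.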
* Re: 2.6.12-rc6-mm1
2005-06-11 5:22 ` 2.6.12-rc6-mm1 Con Kolivas
@ 2005-06-11 5:56 ` Martin J. Bligh
2005-06-11 20:13 ` 2.6.12-rc6-mm1 Martin J. Bligh
1 sibling, 0 replies; 72+ messages in thread
From: Martin J. Bligh @ 2005-06-11 5:56 UTC (permalink / raw)
To: Con Kolivas
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Christoph Lameter, Nick Piggin

--Con Kolivas <kernel@kolivas.org> wrote (on Saturday, June 11, 2005 15:22:30 +1000):

> On Sat, 11 Jun 2005 14:14, Martin J. Bligh wrote:
>> --"Martin J. Bligh" <mbligh@mbligh.org> wrote (on Friday, June 10, 2005
>> > OK, I backed out those 4, and the degradation mostly went away.
>> > See
>> > http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
>> >
>> > and more specifically, see the +p5150 near the right hand side.
>> > I don't think it's quite as good as mainline, but much closer.
>> > I did this run with HZ=1000, and the one with no scheduler
>> > patches at all with HZ=250, so I'll try to do a run that's more
>> > directly comparable as well.
>>
>> OK, that makes it look much more like mainline. Looks like you were still
>> revising the details of your patch Con ... once you're ready, drop me a
>> URL for it, and I'll make the system whack on that too.
>
> Great thanks. Here are all the reconsidered changes rolled up into one patch
> that applies directly to 2.6.12-rc6-mm1 -purely for testing purposes-. I'd be
> very grateful to see how this performs; it has been boot and stress tested at
> this end. If it shows detriment I'll have to make the smp nice changes more
> complex.

Kicked it off - should appear in a few hours as

http://mbligh.org/abat/con_sched_test

M.
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-11 5:22 ` 2.6.12-rc6-mm1 Con Kolivas
2005-06-11 5:56 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-11 20:13 ` Martin J. Bligh
2005-06-11 22:20 ` 2.6.12-rc6-mm1 Con Kolivas
1 sibling, 1 reply; 72+ messages in thread
From: Martin J. Bligh @ 2005-06-11 20:13 UTC (permalink / raw)
To: Con Kolivas
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Christoph Lameter, Nick Piggin

--Con Kolivas <kernel@kolivas.org> wrote (on Saturday, June 11, 2005 15:22:30 +1000):

> On Sat, 11 Jun 2005 14:14, Martin J. Bligh wrote:
>> --"Martin J. Bligh" <mbligh@mbligh.org> wrote (on Friday, June 10, 2005
>> > OK, I backed out those 4, and the degradation mostly went away.
>> > See
>> > http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
>> >
>> > and more specifically, see the +p5150 near the right hand side.
>> > I don't think it's quite as good as mainline, but much closer.
>> > I did this run with HZ=1000, and the one with no scheduler
>> > patches at all with HZ=250, so I'll try to do a run that's more
>> > directly comparable as well.
>>
>> OK, that makes it look much more like mainline. Looks like you were still
>> revising the details of your patch Con ... once you're ready, drop me a
>> URL for it, and I'll make the system whack on that too.
>
> Great thanks. Here are all the reconsidered changes rolled up into one patch
> that applies directly to 2.6.12-rc6-mm1 -purely for testing purposes-. I'd be
> very grateful to see how this performs; it has been boot and stress tested at
> this end. If it shows detriment I'll have to make the smp nice changes more
> complex.

It's much better ... but still a degradation - see point p5181 on:

http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png

Only really seems to hurt the NUMA box (the x440 one ... elm3b67 ... is
still trying to find its ass with both hands). I'm not necessarily saying
it's a problem ... not sure what the benefits of the patch are, but it's a
data point, at least?

M.
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-11 20:13 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-11 22:20 ` Con Kolivas
2005-06-11 23:27 ` 2.6.12-rc6-mm1 Martin J. Bligh
0 siblings, 1 reply; 72+ messages in thread
From: Con Kolivas @ 2005-06-11 22:20 UTC (permalink / raw)
To: Martin J. Bligh
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Christoph Lameter, Nick Piggin

On Sun, 12 Jun 2005 06:13, Martin J. Bligh wrote:
> --Con Kolivas <kernel@kolivas.org> wrote (on Saturday, June 11, 2005
> > Great thanks. Here are all the reconsidered changes rolled up into one
> > patch that applies directly to 2.6.12-rc6-mm1 -purely for testing
> > purposes-. I'd be very grateful to see how this performs; it has been
> > boot and stress tested at this end. If it shows detriment I'll have to
> > make the smp nice changes more complex.
>
> It's much better ... but still a degradation - see point p5181 on:
>
> http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
>
> Only really seems to hurt the NUMA box (the x440 one ... elm3b67 ... is
> still trying to find its ass with both hands). I'm not necessarily saying
> it's a problem ... not sure what the benefits of the patch are, but it's a
> data point, at least?

Thanks a lot!

Just checking the numbering of the test runs with you. This is the blue line
order as plotted on the graph:

5181 is with this patch
4947 is mm1?
5150 is mm1 with the 4 patches backed out
5081 is mm1 with the 4 patches backed out and Hz changed to 100?
5169 is ?

Cheers,
Con
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-11 22:20 ` 2.6.12-rc6-mm1 Con Kolivas
@ 2005-06-11 23:27 ` Martin J. Bligh
2005-06-11 23:47 ` 2.6.12-rc6-mm1 Con Kolivas
0 siblings, 1 reply; 72+ messages in thread
From: Martin J. Bligh @ 2005-06-11 23:27 UTC (permalink / raw)
To: Con Kolivas
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Christoph Lameter, Nick Piggin

--Con Kolivas <kernel@kolivas.org> wrote (on Sunday, June 12, 2005 08:20:05 +1000):

> On Sun, 12 Jun 2005 06:13, Martin J. Bligh wrote:
>> --Con Kolivas <kernel@kolivas.org> wrote (on Saturday, June 11, 2005
>> > Great thanks. Here are all the reconsidered changes rolled up into one
>> > patch that applies directly to 2.6.12-rc6-mm1 -purely for testing
>> > purposes-. I'd be very grateful to see how this performs; it has been
>> > boot and stress tested at this end. If it shows detriment I'll have to
>> > make the smp nice changes more complex.
>>
>> It's much better ... but still a degradation - see point p5181 on:
>>
>> http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/perf/kernbench.moe.png
>>
>> Only really seems to hurt the NUMA box (the x440 one ... elm3b67 ... is
>> still trying to find its ass with both hands). I'm not necessarily saying
>> it's a problem ... not sure what the benefits of the patch are, but it's a
>> data point, at least?
>
> Thanks a lot!
>
> Just checking the numbering of the test runs with you. This is the blue line
> order as plotted on the graph:
>
> 5181 is with this patch
> 4947 is mm1?
> 5150 is mm1 with the 4 patches backed out
> 5081 is mm1 with the 4 patches backed out and Hz changed to 100?
> 5169 is ?

Until I get off my ass and write an html wrapper for the graphs,
the easiest thing to do is just cross-reference to here:

http://ftp.kernel.org/pub/linux/kernel/people/mbligh/abat/regression_matrix.html

The +pXXXX numbers on the graph match the job numbers in the boxes.
You can click on the patches down the left side, and see exactly what
they were if you want.

M.
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-11 23:27 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-11 23:47 ` Con Kolivas
2005-06-12 0:23 ` 2.6.12-rc6-mm1 Martin J. Bligh
0 siblings, 1 reply; 72+ messages in thread
From: Con Kolivas @ 2005-06-11 23:47 UTC (permalink / raw)
To: Martin J. Bligh
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Christoph Lameter, Nick Piggin

[-- Attachment #1: Type: text/plain, Size: 1001 bytes --]

On Sun, 12 Jun 2005 09:27, Martin J. Bligh wrote:
> >> not sure what the benefits of the patch are,

I should have answered this. Since we moved to one runqueue per cpu with the
current scheduler, 'nice' levels basically fall apart on SMP. Balancing tends
to group together all the wrong tasks to have any meaningful 'nice' support
where often on a 2 cpu machine if we run 4 tasks, 2 nice 0 and 2 nice 19 we
end up with:

cpu 1: nice 19 + nice 19
cpu 2: nice 0 + nice 0

which means each nice 19 task gets half a cpu and each nice 0 task gets half a
cpu which is lousy fairness.

The smp nice patches should end up with

cpu 1: nice 0 + nice 19
cpu 2: nice 0 + nice 19

so that the nice 0 tasks get 95% of a cpu and nice 19 tasks get 5% of a cpu.

The patches should balance things as fairly as possible according to nice
levels across cpus. As you can see this is clearly a bug in behaviour and has
been a showstopper for many trying to move from 2.4.

Cheers,
Con

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
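[Editorial aside, not part of the thread] The fairness arithmetic in the message above can be checked with a tiny C sketch. Everything here is an assumption for illustration: cpu_share_percent() is a hypothetical helper, and the weights 95 (nice 0) and 5 (nice 19) are simply read off the 95%/5% figures Con quotes, not taken from the scheduler source. The model is that two tasks sharing one cpu split it in proportion to their weights.

```c
#include <assert.h>

/*
 * Hypothetical helper: percentage of one cpu that a task of weight
 * `self` receives when it shares that cpu with exactly one other task
 * of weight `other`, assuming time is split in proportion to weight.
 */
unsigned int cpu_share_percent(unsigned int self, unsigned int other)
{
	assert(self + other != 0);	/* at least one task must have weight */
	return 100 * self / (self + other);
}
```

Under these assumed weights, the grouped placement (nice 19 with nice 19, nice 0 with nice 0) gives every task 50% regardless of nice level, while the mixed placement (nice 0 with nice 19 on each cpu) gives 95% and 5%, which matches the numbers in the message.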
* Re: 2.6.12-rc6-mm1
2005-06-11 23:47 ` 2.6.12-rc6-mm1 Con Kolivas
@ 2005-06-12 0:23 ` Martin J. Bligh
2005-06-12 5:19 ` 2.6.12-rc6-mm1 Con Kolivas
0 siblings, 1 reply; 72+ messages in thread
From: Martin J. Bligh @ 2005-06-12 0:23 UTC (permalink / raw)
To: Con Kolivas
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Christoph Lameter, Nick Piggin

--Con Kolivas <kernel@kolivas.org> wrote (on Sunday, June 12, 2005 09:47:08 +1000):

> On Sun, 12 Jun 2005 09:27, Martin J. Bligh wrote:
>> >> not sure what the benefits of the patch are,
>
> I should have answered this. Since we moved to one runqueue per cpu with the
> current scheduler, 'nice' levels basically fall apart on SMP. Balancing tends
> to group together all the wrong tasks to have any meaningful 'nice' support
> where often on a 2 cpu machine if we run 4 tasks, 2 nice 0 and 2 nice 19 we
> end up with:
>
> cpu 1: nice 19 + nice 19
> cpu 2: nice 0 + nice 0
>
> which means each nice 19 task gets half a cpu and each nice 0 task gets half a
> cpu which is lousy fairness.
>
> The smp nice patches should end up with
> cpu 1: nice 0 + nice 19
> cpu 2: nice 0 + nice 19
>
> so that the nice 0 tasks get 95% of a cpu and nice 19 tasks get 5% of a cpu.
>
> The patches should balance things as fairly as possible according to nice
> levels across cpus. As you can see this is clearly a bug in behaviour and has
> been a showstopper for many trying to move from 2.4.

Oh, right. That makes a lot of sense ... maybe just let it have an error
factor when migrating across NUMA nodes (i.e. not be as strict)? Not sure
that's really the problem, as I doubt anything in my test is actually
niced anyway (assuming you mean static prio, not dynamic). In that
case, your changes should have no effect, right (from explanation, not
looking at the code ;-))

M.
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-12 0:23 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-12 5:19 ` Con Kolivas
0 siblings, 0 replies; 72+ messages in thread
From: Con Kolivas @ 2005-06-12 5:19 UTC (permalink / raw)
To: Martin J. Bligh
Cc: linux-kernel, Ingo Molnar, Andrew Morton, Christoph Lameter, Nick Piggin

On Sun, 12 Jun 2005 10:23, Martin J. Bligh wrote:
> --Con Kolivas <kernel@kolivas.org> wrote (on Sunday, June 12, 2005 09:47:08 +1000):
> > The patches should balance things as fairly as possible according to nice
> > levels across cpus. As you can see this is clearly a bug in behaviour and
> > has been a showstopper for many trying to move from 2.4.
>
> Oh, right. That makes a lot of sense ... maybe just let it have an error
> factor when migrating across NUMA nodes (i.e. not be as strict)? Not sure
> that's really the problem, as I doubt anything in my test is actually
> niced anyway (assuming you mean static prio, not dynamic). In that
> case, your changes should have no effect, right (from explanation, not
> looking at the code ;-))

The balancing code is not really aware that the loads being returned are being
altered, and it was not clear whether this would be needed or not, as it
usually bases its decisions on ratios of load rather than absolute amounts.
The tricky part is idle balancing, where we don't want to try and pull from a
queue that only has one running task, and the patch has an "if single task
running and idle balancing, tell it only one task running and don't bias"
feature. This may cause slight performance effects on NUMA, as I guess the
other nodes suddenly seem much more loaded and we normally wouldn't try
balancing between nodes until there was a larger load discrepancy than between
cpus. I'll think on this and see how much more nice-aware the balancing code
needs to be for this to not have any effect.

Cheers,
Con
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-08 0:02 ` 2.6.12-rc6-mm1 Christoph Lameter
2005-06-08 0:08 ` 2.6.12-rc6-mm1 Andrew Morton
@ 2005-06-09 1:58 ` Lee Revell
1 sibling, 0 replies; 72+ messages in thread
From: Lee Revell @ 2005-06-09 1:58 UTC (permalink / raw)
To: Christoph Lameter; +Cc: Andrew Morton, Martin J. Bligh, linux-kernel

On Tue, 2005-06-07 at 17:02 -0700, Christoph Lameter wrote:
> On Tue, 7 Jun 2005, Andrew Morton wrote:
>
> > > Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).
> >
> > Oh crap, so it does. That's wrong.
>
> Email by you and Linus indicated that 250 should be the default.

Wait, does that mean the default HZ is going to be changed in the 2.6.x
timeframe? That's a big user-visible regression, as it makes the
sleep() resolution worse, and would force apps with tight timing
requirements to go back to using the RTC like on 2.4.

Unless, of course, the plan is to merge the high-res timers patch at the
same time.

Lee
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-07 23:56 ` 2.6.12-rc6-mm1 Andrew Morton
2005-06-08 0:02 ` 2.6.12-rc6-mm1 Christoph Lameter
@ 2005-06-08 0:02 ` Martin J. Bligh
1 sibling, 0 replies; 72+ messages in thread
From: Martin J. Bligh @ 2005-06-08 0:02 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, Christoph Lameter

>> Diffprofile is wacko (HZ seems to be defaulting to 250 in -mm).
>
> Oh crap, so it does. That's wrong.
>
>> If I factor it by 4x, I get:
>
> Would it be possible to set it back to 100Hz, retest?

Sure, but you mean 1000, right?

M.
^ permalink raw reply [flat|nested] 72+ messages in thread
* 2.6.12-rc6-mm1
@ 2005-06-07 11:29 Andrew Morton
2005-06-07 14:24 ` 2.6.12-rc6-mm1 Wolfgang Wander
` (7 more replies)
0 siblings, 8 replies; 72+ messages in thread
From: Andrew Morton @ 2005-06-07 11:29 UTC (permalink / raw)
To: linux-kernel
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
- Added v9fs
- Various random fixes
- Probably a similar number of breakages
Changes since 2.6.12-rc5-mm2:
-fix-ide-scsi-eh-locking.patch
-ext3-fix-log_do_checkpoint-assertion-failure.patch
-ext3-fix-list-scanning-in-__cleanup_transaction.patch
-namei-fixes-01-19.patch
-namei-fixes-02-19.patch
-namei-fixes-03-19.patch
-namei-fixes-04-19.patch
-namei-fixes-05-19.patch
-namei-fixes-06-19.patch
-namei-fixes-07-19.patch
-namei-fixes-08-19.patch
-namei-fixes-09-19.patch
-namei-fixes-10-19.patch
-namei-fixes-11-19.patch
-namei-fixes-12-19.patch
-namei-fixes-13-19.patch
-namei-fixes-14-19.patch
-namei-fixes-15-19.patch
-namei-fixes-16-19.patch
-namei-fixes-17-19.patch
-namei-fixes-18-19.patch
-namei-fixes-19-19.patch
-ipmi-class_simple-fixes.patch
-gregkh-i2c-i2c-ali1563.patch
-git-ocfs-fix-for-shemminger-tcp-stuff.patch
-gregkh-pci-pci-hotplug-shpchp-_HPP-fix.patch
-gregkh-pci-pci-hotplug-shpchp-PERR-fix.patch
-gregkh-pci-pci-amd74xx-ids.patch
-gregkh-pci-pci-cpci-update.patch
-gregkh-usb-usb-sl811-hcd-fixes.patch
-gregkh-usb-usb-sl811_cs.patch
-gregkh-usb-usb-ftdi_sio-new-id.patch
-gregkh-usb-usb-serial-generic-init-fix.patch
-gregkh-usb-usb-ub_multi_lun.patch
-gregkh-usb-usb-remove_pwc_changelog.patch
-gregkh-usb-usb-add-new-wacom-device-to-usb-hid-core-list.patch
-gregkh-usb-usb-urb_documentation.patch
-gregkh-usb-usb-earthmate-hid-blacklist.patch
-gregkh-usb-usb-storage-trumpion.patch
-gregkh-usb-usb-modalias-shrink.patch
-gregkh-usb-usb-cp2101-flow-control.patch
-gregkh-usb-usb-usbatm-reduce-log-spam.patch
-gregkh-usb-usb-usbatm-avoid-oops-on-bind-failure.patch
-gregkh-usb-usb-usbatm-1-fix.patch
-usb-option-card-driver.patch
-usb-wacom-tablet-driver.patch
-atm-nicstar-remove-a-bunch-of-pointless-casts-of-null.patch
-fix-atm-build-with-o=.patch
-drivers-net-hamradio-baycom_eppc-cleanups.patch
-ppc32-apple-device-tree-bug-fix.patch
-ppc32-ppc64-cleanup-proc-device-tree.patch
-ppc64-cleanup-spr-definitions.patch
-ppc64-cleanup-iseries-runlight-support.patch
-ppc64-remove-decr_overclock.patch
-ppc64-fix-a-device-tree-bug-on-apples.patch
-i386-collect-host-bridge-resources.patch
-x86_64-collect-host-bridge-resources.patch
-allow-ev_abs-to-work-in-uinputc.patch
-serial-update-nec-vr4100-series-serial-support.patch
Merged
+ppc32-add-linux-compilerh-to-asm-sigcontexth.patch
+include-linux-configh-before-testing-config_acpi.patch
+uml-make-the-emulated-iomem-driver-work-on-26.patch
+uml-compile-fixes-for-gcc-4.patch
+uml-fix-strace-f.patch
+uml-clean-up-error-path.patch
+uml-link-tt-mode-against-nptl.patch
+send_ipi_mask_sequence-warning-fix.patch
+ppc32-add-405ep-cpu_spec-entry.patch
+input-disable-scroll-feature-on-at-keyboards.patch
Planned for 2.6.12
+x86_64-task_size-fixes-for-compatibility-mode-processes.patch
x86_64 critical fixes (needs work)
+ia64-disable-preempt.patch
Disable CONFIG_PREEMPT on ia64 (it has problems with floating-point
save/restore)
+fix-up-macro-abuse-in-drivers-acpi-sleep-procc.patch
ACPI cleanup
+git-arm.patch
+git-arm-smp.patch
ARM git trees
-git-cpufreq.patch
Empty
+fix-warning-in-powernow-k8c.patch
Fix a cpufreq warning
+gregkh-driver-ipmi-class_simple-fixes.patch
+gregkh-driver-sysfs-permissions-01.patch
+gregkh-driver-sysfs-permissions-02.patch
+gregkh-driver-sysfs-permissions-03.patch
+gregkh-driver-dont-loose-devices-on-suspend-failure.patch
New driver core patches
-bk-drm.patch
-bk-drm-via.patch
DRM is moving to git
-update-drm-ioctl-compatibility-to-new-world-order.patch
The code which this patches isn't there any more (it will come back)
+git-drm-initmap.patch
+git-drm-via.patch
Some DRM git trees
+gregkh-i2c-i2c-Kconfig-update.patch
+gregkh-i2c-i2c-pcf8574-cleanup.patch
+gregkh-i2c-i2c-adm9240-docs.patch
+gregkh-i2c-i2c-device-attr-lm90.patch
+gregkh-i2c-i2c-device-attr-lm83.patch
+gregkh-i2c-i2c-device-attr-lm63.patch
+gregkh-i2c-i2c-device-attr-it87.patch
+gregkh-i2c-hwmon-01.patch
+gregkh-i2c-hwmon-02.patch
+gregkh-i2c-hwmon-03.patch
i2c tree updates
+i2c-chips-need-hwmon.patch
+gregkh-i2c-hwmon-02-sparc64-fix.patch
Fix a few things in the i2c tree
+sonypi-make-sure-that-input_work-is-not-running-when-unloading.patch
sonypi fix
-git-libata-adma.patch
-git-libata-ahci-msi.patch
-git-libata-bridge-detect.patch
-git-libata-chs-support.patch
-git-libata-docs.patch
-git-libata-svw.patch
-git-libata-promise-sata-pata.patch
-git-libata-pdc2027x.patch
Dropped the libata tree - it changes all the time and I can't work out wtf
is going on.
-git-netdev-r8169.patch
Too many rejects from this one.
+fix-recursive-ipw2200-dependencies.patch
+drivers-net-chelsio-cxgb2-use-the-dma_3264bit_mask-constants.patch
+drivers-net-wireless-ipw2100-use-the-dma_32bit_mask-constant.patch
+drivers-net-wireless-ipw2200-use-the-dma_32bit_mask-constant.patch
+fix-tulip-suspend-resume.patch
Net driver fixes
+scalable-tcp-cleaned.patch
"scalable TCP"
+git-serial.patch
Serial subsystem tree
+gregkh-pci-pci-fix-routing-in-parent-bridge.patch
+gregkh-pci-pci-dma-bursting-advice.patch
+gregkh-pci-pci-collect-host-bridge-resources-01.patch
+gregkh-pci-pci-collect-host-bridge-resources-02.patch
PCI subsystem tree updates
+gregkh-pci-pci-dma-bursting-advice-fix.patch
Fix it
-git-scsi-rc-fixes.patch
This is empty
+gregkh-usb-usb-usbatm-reduce-log-spam.patch
+gregkh-usb-usb-usbatm-avoid-oops-on-bind-failure.patch
+gregkh-usb-usb-usbatm-fix-gcc-2.95.x.patch
+gregkh-usb-usb-usbatm-kcalloc.patch
+gregkh-usb-usb-uhci-detect-invalid-ports.patch
+gregkh-usb-usb-export-getput_intf.patch
+gregkh-usb-usb-cdc-acm-reference-count-fix.patch
+gregkh-usb-usb-ehci-fix-page-pointer-allocate.patch
+gregkh-usb-usb-wireless-definitions.patch
+gregkh-usb-usb-usblp-race-fix.patch
+gregkh-usb-usb-stv680-creative-mini.patch
+gregkh-usb-usb-atiremote-sysfs-links.patch
+gregkh-usb-usb-gotemp.patch
USB tree updates
+sparsemem-memory-model-fix-4.patch
+sparsemem-memory-model-fix-5.patch
Fix sparsemem-memory-model.patch even more
+sparsemem-hotplug-base-fix.patch
Fix sparsemem-hotplug-base.patch
-vm-merge_lru_pages.patch
-vm-page-cache-reclaim-core.patch
-vm-page-cache-reclaim-core-tidy.patch
-vm-reclaim_page_cache_node-syscall.patch
-vm-reclaim_page_cache_node-syscall-x86.patch
-vm-automatic-reclaim-through-mempolicy.patch
+vm-add-may_swap-flag-to-scan_control.patch
+vm-early-zone-reclaim.patch
+vm-early-zone-reclaim-tidy.patch
+vm-add-__gfp_noreclaim.patch
+vm-rate-limit-early-reclaim.patch
These patches were updated
+node-local-per-cpu-pages-tidy-2-fix.patch
Fix node-local-per-cpu-pages.patch some more.
+avoiding-mmap-fragmentation-revert-unneeded-64-bit-changes-vs-x86_64-task_size-fixes-for-compatibility-mode-processes.patch
Fix a patch clash
+__mod_page_state-pass-unsigned-long-instead-of-unsigned.patch
+__read_page_state-pass-unsigned-long-instead-of-unsigned.patch
Warning fixes
+add-oom-debug.patch
Additional debug output when the box goes oom.
+periodically-drain-non-local-pagesets.patch
+periodically-drain-non-local-pagesets-fix.patch
Shrink the per-cpu-pages caches occasionally
+ia64-uncached-alloc.patch
+sn2-xpc-build-patches.patch
Special allocator for uncached pages
+shmem-restore-superblock-info.patch
+mbind-fix-verify_pages-pte_page.patch
+mbind-check_range-use-standard-ptwalk.patch
+dup_mmap-update-comment-on-new-vma.patch
+bad_page-clear-reclaim-and-slab.patch
+rme96xx-fix-pagereserved-range.patch
+get_user_pages-kill-get_page_map.patch
+do_wp_page-cannot-share-file-page.patch
+can_share_swap_page-use-page_mapcount.patch
+msync-check-pte-dirty-earlier.patch
Various mm fixes
+sunzilog-warning-fixes.patch
+ppp-handle-misaligned-accesses.patch
Net fixes
+ppc32-removed-dependency-on-config_cpm2-for-building.patch
+ppc32-converted-mpc10x-bridge-to-use-platform.patch
+cpm_uart-route-scc2-pins-for-the-stx-gp3-board.patch
ppc32 updates
+ppc64-iseries-remove-iseries_proch.patch
+ppc64-iseries-header-file-white-space-cleanups.patch
+ppc64-iseries-more-header-file-white-space-cleanups.patch
+ppc64-iseries-obvious-code-simplifications.patch
+ppc64-iseries-remove-lpardatah.patch
+ppc64-iseries-eliminate-some-unused-inline-functions.patch
+ppc64-iseries-remove-hvcallcfgh.patch
+ppc64-iseries-cleanup-itlpqueueh.patch
+ppc64-iseries-tidy-up-some-includes-and-hvcallh.patch
+ppc64-iseries-misc-header-cleanups.patch
+update-ppc64-defconfig.patch
+ppc64-iseries-remove-iseries_pci_resetc.patch
+ppc64-iseries-iommuh-cleanups.patch
+ppc64-iseries-iseries_vpdinfoc-cleanups.patch
+ppc64-iseries-iseries_pcih-cleanups.patch
+ppc64-iseries-remove-ioretry-from-iseries_device_node.patch
+ppc64-iseries-remove-some-more-members-of.patch
ppc64 updates
+x86-x86_64-pcibus_to_node-fix.patch
Fix x86-x86_64-pcibus_to_node.patch
+mempool-bounce-buffer-restriction.patch
Limit the amount of memory which can be used for bounce buffers
+arm-irqs_disabled-type-fix.patch
ARM warning fix
+variable-overflow-after-hundreds-round-of-hotplug-cpu.patch
CPU hotplug fix
+x86_64-change-init-sections-for-cpu-hotplug-support.patch
+x86_64-change-init-sections-for-cpu-hotplug-support-fix.patch
+x86_64-cpu-hotplug-support.patch
+x86_64-cpu-hotplug-sibling-map-cleanup.patch
+x86_64-dont-use-broadcast-shortcut-to-make-it-cpu-hotplug-safe.patch
+x86_64-provide-ability-to-choose-using-shortcuts-for-ipi-in-flat-mode.patch
CPU hotplug for x86_64
+m32r-support-m3a-2170mappi-iii-platform-fix.patch
+m32r-support-m3a-2170mappi-iii-platform-fix-2.patch
+m32r-update-setup_xxxxxc.patch
+m32r-update-m32r_cfc-to-support-mappi-iii-fix.patch
+m32r-cleanup-arch-m32r-mm-extablec.patch
+m32r-remove-include-asm-m32r-m32102perih.patch
+m32r-update-defconfig-files.patch
+m32r-use-asm-generic-div64h.patch
m32r fixes and updates
+s390-cio-max-channels-checks.patch
+s390-cio-documentation.patch
+s390-ifdefs-in-compat_ioctls.patch
+s390-kernel-stack-overflow-panic.patch
+s390-cmm-sender-parameter-visibility.patch
+s390-memory-detection-32gb.patch
+s390-pending-interrupt-after-ipl-from-reader.patch
s/390 updates
+ecryptfs-export-user-key-type.patch
Export a symbol
+x86_64-specific-function-return-probes.patch
+kprobes-ia64-cleanup-2.patch
+kprobes-ia64-cmp-ctype-unc-support.patch
+kprobes-ia64-safe-register-kprobe.patch
+kprobes-temporary-disarming-of-reentrant-probe-for-x86_64-fix.patch
+allow-a-jprobe-to-coexist-with-muliple-kprobes.patch
kprobes updates
+cs4236-irq-handling-fix.patch
OSS driver fix
+block-add-unlocked_ioctl-support-for-block-devices.patch
Support lock_kernel-less ioctls on blockdevs
+pcdp-handle-tables-that-dont-supply-baud-rate.patch
serial driver update
+stop-arch-i386-kernel-vsyscall-noteo-being-rebuilt-every-time.patch
kbuild fix
+remove-f_error-field-from-struct-file.patch
cleanup
+autofs4-avoid-panic-on-bind-mount-of-autofs-owned-directory.patch
+autofs4-post-expire-race-fix.patch
+autofs4-bad-lookup-fix.patch
+autofs4-subversion-bump-to-identify-these-changes.patch
autofs4 updates
+rapidio-support-core-base.patch
+rapidio-support-core-includes.patch
+rapidio-support-core-enum.patch
+rapidio-support-ppc32.patch
+rapidio-support-net-driver.patch
RapidIO driver
+dlm-lockspaces-callbacks-directory-dlm-consistent-ifdefs.patch
+dlm-lockspaces-callbacks-directory-fix-2-dlm-dont-repeat-include.patch
+dlm-lockspaces-callbacks-directory-fix-3.patch
+dlm-lockspaces-callbacks-directory-dlm-dont-free-lvb-twice.patch
+dlm-communication-dlm-dont-add-duplicate-node-addresses.patch
+dlm-recovery-dlm-timer-cant-be-global.patch
+dlm-recovery-dlm-clear-recovery-flags.patch
+dlm-device-interface-dlm-uncomment-unregister_lockspace.patch
+dlm-device-interface-dlm-newline-in-printks.patch
+dlm-debug-fs-dlm-consistent-ifdefs.patch
Various fixes and updates to the DLM driver
+tuner-corec-improvments-and-ymec-tvision-tvf8533mf.patch
v4l update
+oprofile-report-anonymous-region-samples.patch
oprofile feature
+lockd-flush-signals-on-shutdown.patch
+nfs4-hold-filp-while-reading-or-writing.patch
+nfsd4-fix-probe_callback.patch
+nfsd4-nfs4_check_open_reclaim-cleanup.patch
+nfsd4-create-separate-laundromat-workqueue.patch
+nfsd4-simplify-lease-changing.patch
+nfsd4-delegation-recovery.patch
+nfsd4-rename-nfs4_state_init.patch
+nfsd4-clean-up-state-initialization.patch
+nfsd4-remove-nfs4_reclaim_init.patch
+nfsd4-idmap-initialization.patch
+nfsd4-setclientid-simplification.patch
+nfsd4-reboot-hash.patch
+nfsd4-add-find_unconf_by_str-functions-to-simplify-setclientid.patch
+nfsd4-grace-period-end.patch
+nfsd4-make-needlessly-global-code-static.patch
+nfsd4-fix-uncomfirmed-list.patch
+nfsd4-fix-setclientid_confirm-cases.patch
+nfsd4-fix-setclientid_confirm-error-return.patch
+nfsd4-setclientid_confirm-gotoectomy.patch
+nfsd4-setclientid_confirm-comments.patch
+nfsd4-miscellaneous-setclientid_confirm-cleanup.patch
+nfsd4-rename-state-list-fields.patch
+nfsd4-allow-multiple-lockowners.patch
+nfsd4-remove-cb_parsed.patch
+nfsd4-initialize-recovery-directory.patch
+nfsd4-reboot-recovery.patch
+nfsd4-reboot-dirname.patch
nfsd updates
+isofs-show-hidden-files-add-granularity-for-assoc-hidden-files-flags.patch
+isofs-show-hidden-files-add-granularity-for-assoc-hidden-files-flags-tidy.patch
+isofs-show-hidden-files-add-granularity-for-assoc-hidden-files-flags-fix.patch
isofs feature work
+numa-aware-slab-allocator-v5.patch
The NUMA-aware slab allocator is back. Needs ifdef-reduction work.
-periodically-scan-redzone-entries-and-slab-control-structures.patch
-slab-leak-detector.patch
-slab-leak-detector-warning-fixes.patch
It broke these.
+numa-aware-slab-allocator-v3-__bad_size-fix.patch
Fix it.
+sched-run-sched_normal-tasks-with-real-time-tasks-on-smt-siblings.patch
CPU scheduler fix
+v4l-add-support-for-pixelview-ultra-pro.patch
+dvico-fusionhdtv3-gold-t-documentation-fix.patch
v4l updates
+kexec-code-cleanup.patch
Make all the kexec patches resemble CodingStyle.
+v9fs-documentation-makefiles-configuration.patch
+v9fs-documentation-makefiles-configuration-fix.patch
+v9fs-vfs-file-dentry-and-directory-operations.patch
+v9fs-vfs-file-dentry-and-directory-operations-fix.patch
+v9fs-vfs-inode-operations.patch
+v9fs-vfs-superblock-operations-and-glue.patch
+v9fs-9p-protocol-implementation.patch
+v9fs-transport-modules.patch
+v9fs-debug-and-support-routines.patch
+v9fs-debug-and-support-routines-fix.patch
The plan9 networked filesystem
+framebuffer-driver-for-arc-lcd-board.patch
+framebuffer-driver-for-arc-lcd-board-tidy.patch
+framebuffer-driver-for-arc-lcd-board-update.patch
+new-pci-id-for-chipsfb.patch
fbdev updates
+modules-add-version-and-srcversion-to-sysfs-fix.patch
+modules-add-version-and-srcversion-to-sysfs-fix-2.patch
Fix modules-add-version-and-srcversion-to-sysfs.patch
+fuse-device-functions-fuse-serious-information-leak-fix.patch
FUSE fix
+remove-redundant-info-from-submittingpatches.patch
Documentation update
-unexport-slab_reclaim_pages.patch
Drop this due to some reject.
number of patches in -mm: 1397
number of changesets in external trees: 53
number of patches in -mm only: 1395
total patches: 1448
All 1397 patches:
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/patch-list
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-07 11:29 2.6.12-rc6-mm1 Andrew Morton
@ 2005-06-07 14:24 ` Wolfgang Wander
2005-06-07 14:49 ` 2.6.12-rc6-mm1 Wolfgang Wander
2005-06-07 14:48 ` 2.6.12-rc6-mm1 Brice Goglin
` (6 subsequent siblings)
7 siblings, 1 reply; 72+ messages in thread
From: Wolfgang Wander @ 2005-06-07 14:24 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, Chen, Kenneth W
Andrew Morton wrote:
> +avoiding-mmap-fragmentation-revert-unneeded-64-bit-changes-vs-x86_64-task_size-fixes-for-compatibility-mode-processes.patch
As a heads-up.
This one breaks the fragmentation reduction patch in 32 bit emulation mode.
Our test case shows the standard 17 fragmented regions in /proc/self/maps
(as in the 2.6 standard kernel) vs the 2 regions in 2.6.12-rc5-mm2 (and
before).
Somehow the new way of detecting 32 bit emulation mode seems to fail here.
I'll try to figure out a fix.
Wolfgang
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-07 14:24 ` 2.6.12-rc6-mm1 Wolfgang Wander
@ 2005-06-07 14:49 ` Wolfgang Wander
0 siblings, 0 replies; 72+ messages in thread
From: Wolfgang Wander @ 2005-06-07 14:49 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, Chen, Kenneth W
[-- Attachment #1: Type: text/plain, Size: 1053 bytes --]
Wolfgang Wander wrote:
> Andrew Morton wrote:
>
>> +avoiding-mmap-fragmentation-revert-unneeded-64-bit-changes-vs-x86_64-task_size-fixes-for-compatibility-mode-processes.patch
>>
>
> As a heads-up.
>
> This one breaks the fragmentation reduction patch in 32 bit emulation mode.
> Our test case shows the standard 17 fragmented regions in
> /proc/self/maps (as in
> the 2.6 standard kernel) vs the 2 regions in 2.6.12-rc5-mm2 (and before).
>
> Somehow the new way of detecting 32 bit emulation mode seems to fail here.
>
> I'll try to figure out a fix.
>
Here is one possibility:
Since rc6 the difference between TASK_UNMAPPED_64 and TASK_UNMAPPED_32 is
gone and both are now merged into TASK_UNMAPPED_BASE. Therefore we can no
longer check our local base against TASK_UNMAPPED_BASE to see if we are
running in 32bit emulation mode.
The appended patch uses other (hopefully the right) means.
Tested on x86_64 in 32 and 64 mode (64 bit fragments as desired, 32 bit
collapses as desired).
Signed-off-by: Wolfgang Wander <wwc@rentec.com>
[-- Attachment #2: avoiding-mmap-fragmentation-revert-unneeded-64-bit-changes-vs-x86_64-task_size-fixes-for-compatibility-mode-processes-fix.patch --]
[-- Type: text/x-patch, Size: 511 bytes --]
--- arch/x86_64/kernel/sys_x86_64.c~	2005-06-07 09:12:31.000000000 -0400
+++ arch/x86_64/kernel/sys_x86_64.c	2005-06-07 10:32:07.000000000 -0400
@@ -105,7 +105,8 @@ arch_get_unmapped_area(struct file *filp
 		    (!vma || addr + len <= vma->vm_start))
 			return addr;
 	}
-	if (begin != TASK_UNMAPPED_BASE && len <= mm->cached_hole_size) {
+	if (((flags & MAP_32BIT) || test_thread_flag(TIF_IA32))
+	    && len <= mm->cached_hole_size) {
 		mm->cached_hole_size = 0;
 		mm->free_area_cache = begin;
 	}
^ permalink raw reply [flat|nested] 72+ messages in thread
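The reasoning in the message above can be illustrated with a small user-space model of the two conditions. This is only a sketch of the logic as the mail describes it, not kernel code: the function names and the SKETCH_MAP_32BIT constant are hypothetical stand-ins, and the per-thread TIF_IA32 test is modelled as a plain boolean argument.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's MAP_32BIT mmap flag. */
#define SKETCH_MAP_32BIT 0x40UL

/*
 * Old condition: infer the 32-bit case by comparing the search start
 * against the unmapped base. Once both bases are the same value, the
 * comparison can never be true.
 */
static bool reset_cache_old(unsigned long begin, unsigned long unmapped_base,
			    unsigned long len, unsigned long cached_hole_size)
{
	return begin != unmapped_base && len <= cached_hole_size;
}

/*
 * New condition from the patch: ask directly whether the mapping or
 * the task is 32-bit (ia32_task models test_thread_flag(TIF_IA32)).
 */
static bool reset_cache_new(unsigned long flags, bool ia32_task,
			    unsigned long len, unsigned long cached_hole_size)
{
	return ((flags & SKETCH_MAP_32BIT) || ia32_task) &&
	       len <= cached_hole_size;
}
```

With a merged base the old predicate stays false even for a 32-bit task, so the free-area cache is never reset, which would be consistent with the fragmentation reported in the thread; the new predicate no longer depends on the base value at all.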
* Re: 2.6.12-rc6-mm1
2005-06-07 11:29 2.6.12-rc6-mm1 Andrew Morton
2005-06-07 14:24 ` 2.6.12-rc6-mm1 Wolfgang Wander
@ 2005-06-07 14:48 ` Brice Goglin
2005-06-07 23:15 ` 2.6.12-rc6-mm1 Francois Romieu
` (5 subsequent siblings)
7 siblings, 0 replies; 72+ messages in thread
From: Brice Goglin @ 2005-06-07 14:48 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
>
> - Added v9fs
>
> - Various random fixes
>
> - Probably a similar number of breakages
Hi Andrew,
I didn't see any breakage.
But I get these two lines during boot:
yenta 0000:02:03.1: no resource of type 100 available, trying to continue...
yenta 0000:02:03.1: no resource of type 100 available, trying to continue...
Anyway, my PCMCIA slots seem to still work.
Brice
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-07 11:29 2.6.12-rc6-mm1 Andrew Morton
2005-06-07 14:24 ` 2.6.12-rc6-mm1 Wolfgang Wander
2005-06-07 14:48 ` 2.6.12-rc6-mm1 Brice Goglin
@ 2005-06-07 23:15 ` Francois Romieu
2005-06-08 1:59 ` 2.6.12-rc6-mm1 Søren Lott
` (4 subsequent siblings)
7 siblings, 0 replies; 72+ messages in thread
From: Francois Romieu @ 2005-06-07 23:15 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
Andrew Morton <akpm@osdl.org> :
[...]
> -git-netdev-r8169.patch
>
> Too many rejects from this one.
How did you generate git-netdev-r8169.patch ?
Jeff's 'upstream-2.6.13' includes all the pending r8169 changes and nothing
will be merged before 2.6.12 is out. Imho you can safely ignore any r8169
change until 2.6.12 appears.
--
Ueimor
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-07 11:29 2.6.12-rc6-mm1 Andrew Morton
` (2 preceding siblings ...)
2005-06-07 23:15 ` 2.6.12-rc6-mm1 Francois Romieu
@ 2005-06-08 1:59 ` Søren Lott
2005-06-08 5:53 ` 2.6.12-rc6-mm1 Jean Delvare
2005-06-08 14:22 ` 2.6.12-rc6-mm1 Andy Whitcroft
` (3 subsequent siblings)
7 siblings, 1 reply; 72+ messages in thread
From: Søren Lott @ 2005-06-08 1:59 UTC (permalink / raw)
To: Andrew Morton, gregkh; +Cc: linux-kernel
On Tuesday 07 June 2005 08:29, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
[snip]
> +gregkh-i2c-i2c-Kconfig-update.patch
> +gregkh-i2c-i2c-pcf8574-cleanup.patch
> +gregkh-i2c-i2c-adm9240-docs.patch
> +gregkh-i2c-i2c-device-attr-lm90.patch
> +gregkh-i2c-i2c-device-attr-lm83.patch
> +gregkh-i2c-i2c-device-attr-lm63.patch
> +gregkh-i2c-i2c-device-attr-it87.patch
> +gregkh-i2c-hwmon-01.patch
> +gregkh-i2c-hwmon-02.patch
> +gregkh-i2c-hwmon-03.patch
>
> i2c tree updates
>
> +i2c-chips-need-hwmon.patch
> +gregkh-i2c-hwmon-02-sparc64-fix.patch
>
> Fix a few things in the i2c tree
[snip]
after those changes i don't get entries in /sys for my W83627THF chip.
(p4c800-D, i875,ICH5)
relevant config parts:
CONFIG_HWMON=y
CONFIG_I2C=y
CONFIG_I2C_ISA=y
CONFIG_I2C_SENSOR=y
CONFIG_SENSORS_W83627HF=y
thanks.
-SL
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-08 1:59 ` 2.6.12-rc6-mm1 Søren Lott
@ 2005-06-08 5:53 ` Jean Delvare
2005-06-08 7:08 ` 2.6.12-rc6-mm1 Søren Lott
0 siblings, 1 reply; 72+ messages in thread
From: Jean Delvare @ 2005-06-08 5:53 UTC (permalink / raw)
To: Søren Lott; +Cc: Andrew Morton, Greg KH, LKML, LM Sensors
Hi Soren,
> [snip]
>
> > +gregkh-i2c-i2c-Kconfig-update.patch
> > +gregkh-i2c-i2c-pcf8574-cleanup.patch
> > +gregkh-i2c-i2c-adm9240-docs.patch
> > +gregkh-i2c-i2c-device-attr-lm90.patch
> > +gregkh-i2c-i2c-device-attr-lm83.patch
> > +gregkh-i2c-i2c-device-attr-lm63.patch
> > +gregkh-i2c-i2c-device-attr-it87.patch
> > +gregkh-i2c-hwmon-01.patch
> > +gregkh-i2c-hwmon-02.patch
> > +gregkh-i2c-hwmon-03.patch
> >
> > i2c tree updates
> >
> > +i2c-chips-need-hwmon.patch
> > +gregkh-i2c-hwmon-02-sparc64-fix.patch
> >
> > Fix a few things in the i2c tree
>
> [snip]
>
> after those changes i don't get entries in /sys for my W83627THF chip.
>
> (p4c800-D, i875,ICH5)
>
> relevant config parts:
>
> CONFIG_HWMON=y
> CONFIG_I2C=y
> CONFIG_I2C_ISA=y
> CONFIG_I2C_SENSOR=y
> CONFIG_SENSORS_W83627HF=y
Which kernel are you upgrading from?
Is CONFIG_PNPACPI set? If it is, try without it.
If it doesn't work, please try reverting (in reverse order):
gregkh-i2c-hwmon-01.patch
gregkh-i2c-hwmon-02.patch
gregkh-i2c-hwmon-03.patch
i2c-chips-need-hwmon.patch
gregkh-i2c-hwmon-02-sparc64-fix.patch
and see how it goes.
Thanks,
--
Jean Delvare
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-08 5:53 ` 2.6.12-rc6-mm1 Jean Delvare
@ 2005-06-08 7:08 ` Søren Lott
0 siblings, 0 replies; 72+ messages in thread
From: Søren Lott @ 2005-06-08 7:08 UTC (permalink / raw)
To: Jean Delvare; +Cc: Andrew Morton, Greg KH, LKML, LM Sensors
On Wednesday 08 June 2005 02:53, Jean Delvare wrote:
> Hi Soren,
Hi,
> Which kernel are you upgrading from?
from 2.6.12-rc5-mm2
> Is CONFIG_PNPACPI set? If it is, try without it.
nope, don't even have CONFIG_PNP set.
> If it doesn't work, please try reverting (in reverse order):
> gregkh-i2c-hwmon-01.patch
> gregkh-i2c-hwmon-02.patch
> gregkh-i2c-hwmon-03.patch
> i2c-chips-need-hwmon.patch
> gregkh-i2c-hwmon-02-sparc64-fix.patch
> and see how it goes.
yeap, reverting these did the trick, all i2c entries in sysfs are back. :)
> Thanks,
thanks a lot.
cheers.
-SL
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-07 11:29 2.6.12-rc6-mm1 Andrew Morton
` (3 preceding siblings ...)
2005-06-08 1:59 ` 2.6.12-rc6-mm1 Søren Lott
@ 2005-06-08 14:22 ` Andy Whitcroft
2005-06-08 20:01 ` 2.6.12-rc6-mm1 Andrew Morton
2005-06-09 4:27 ` 2.6.12-rc6-mm1 Andrey Panin
2005-06-11 11:51 ` 2.6.12-rc6-mm1 Benoit Boissinot
` (2 subsequent siblings)
7 siblings, 2 replies; 72+ messages in thread
From: Andy Whitcroft @ 2005-06-08 14:22 UTC (permalink / raw)
To: Andrew Morton, Andrey Panin; +Cc: linux-kernel
We've been seeing an early boot hang on IBM x-series (at least on an
x440) with -rc6-mm1. Finally got hold of a box to go search for this
and it seems that backing out the three patches below fixes it.
515 dmi-move-acpi-boot-quirk.patch
516 dmi-move-acpi-sleep-quirk.patch
517 dmi-remove-central-blacklist.patch
I am pretty sure it is actually the first one (that's where my bisection
search pointed) but I had to drop the other two to back it out. Anyhow,
2.6.12-rc6-mm1 boots on an x440 with these backed out.
Cheers.
-apw
^ permalink raw reply [flat|nested] 72+ messages in thread
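The bisection search mentioned in the message above can be sketched as a binary search over the ordered patch series. This is a generic illustration, not the procedure actually used in the thread. It assumes the failure is monotonic (once the bad patch is applied, every longer prefix of the series also fails) and that the unpatched base boots; all names are hypothetical, and example_boots merely simulates a series whose patch 515 breaks boot.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if a kernel with the first `applied` patches boots. */
typedef bool (*boots_fn)(size_t applied);

/*
 * Returns the 1-based index of the first patch that breaks boot,
 * or 0 if the full series boots. Needs O(log total) boot tests.
 */
static size_t first_bad_patch(size_t total, boots_fn boots)
{
	size_t lo = 0;		/* longest prefix known to boot */
	size_t hi = total;	/* shortest prefix known to fail */

	if (boots(total))
		return 0;

	while (hi - lo > 1) {
		size_t mid = lo + (hi - lo) / 2;

		if (boots(mid))
			lo = mid;
		else
			hi = mid;
	}
	return hi;
}

/* Hypothetical test oracle: boot breaks once patch 515 is applied. */
static bool example_boots(size_t applied)
{
	return applied < 515;
}
```

With 1397 patches in the series, such a search needs about eleven test boots rather than hundreds.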
* Re: 2.6.12-rc6-mm1
2005-06-08 14:22 ` 2.6.12-rc6-mm1 Andy Whitcroft
@ 2005-06-08 20:01 ` Andrew Morton
2005-06-08 23:14 ` 2.6.12-rc6-mm1 Martin J. Bligh
2005-06-09 4:27 ` 2.6.12-rc6-mm1 Andrey Panin
1 sibling, 1 reply; 72+ messages in thread
From: Andrew Morton @ 2005-06-08 20:01 UTC (permalink / raw)
To: Andy Whitcroft; +Cc: pazke, linux-kernel
Andy Whitcroft <apw@shadowen.org> wrote:
>
> We've been seeing an early boot hang on IBM x-series (at least on an
> x440) with -rc6-mm1. Finally got hold of a box to go search for this
> and it seems that backing out the three patches below fixes it.
>
> 515 dmi-move-acpi-boot-quirk.patch
> 516 dmi-move-acpi-sleep-quirk.patch
> 517 dmi-remove-central-blacklist.patch
Thanks for taking the time to do that - it helps enormously.
The patches aren't terribly important - I'll drop them if nobody sees the
problem. It might be an incorrect __init/__initdata/etc marking. But that
wouldn't cause an "early" boot hang...
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-08 20:01 ` 2.6.12-rc6-mm1 Andrew Morton
@ 2005-06-08 23:14 ` Martin J. Bligh
2005-06-08 23:22 ` 2.6.12-rc6-mm1 Andrew Morton
0 siblings, 1 reply; 72+ messages in thread
From: Martin J. Bligh @ 2005-06-08 23:14 UTC (permalink / raw)
To: Andrew Morton, Andy Whitcroft; +Cc: pazke, linux-kernel
--On Wednesday, June 08, 2005 13:01:17 -0700 Andrew Morton <akpm@osdl.org> wrote:
> Andy Whitcroft <apw@shadowen.org> wrote:
>>
>> We've been seeing an early boot hang on IBM x-series (at least on an
>> x440) with -rc6-mm1. Finally got hold of a box to go search for this
>> and it seems that backing out the three patches below fixes it.
>>
>> 515 dmi-move-acpi-boot-quirk.patch
>> 516 dmi-move-acpi-sleep-quirk.patch
>> 517 dmi-remove-central-blacklist.patch
>
> Thanks for taking the time to do that - it helps enormously.
>
> The patches aren't terribly important - I'll drop them if nobody sees the
> problem. It might be an incorrect __init/__initdata/etc marking. But that
> wouldn't cause an "early" boot hang...
That does indeed make it boot. However ... once it's booted it seems to
hit another problem, a hang condition ;-( I suspect it's unrelated.
The box is still up and responsive, but cp spins. I'm still chasing the
other boot/hang double problem (amd64), so can't really look at this
right now, but if anyone has any bright ideas they want me to try, or
wants more info, let me know (machine is still hung in that state).
Some snippets:
ps -ef:
root 10980 10979 0 09:02 ? 00:00:00 /bin/bash /usr/local/autobench/scripts/run test kernbench 32 5 - m 2
root 11060 10980 0 09:02 ? 00:00:00 /bin/bash /usr/local/autobench/scripts/getsysinfo before /usr/local/autobench/logs/k
root 11219 11060 0 09:02 ? 00:00:00 /bin/bash /usr/local/autobench/scripts/archive_dir /proc/scsi /usr/local/autobench/l
root 11221 11219 99 09:02 ? 04:13:26 cp -r /proc/scsi/aic7xxx /proc/scsi/device_info /proc/scsi/scsi /usr/local/autobench
alt+sysrq+t
getsysinfo S CB5260CC 0 11060 10980 11219 (NOTLB)
d5fc1f40 00000082 fffffe00 cb5260cc 00000000 c011259b 2691b900 003d08e4
 080fa558 00000001 d5fc1f38 c04715c0 c0473080 bfcb43b8 d740e000 cb526020
 00000001 cb526020 00000007 d5fc1fbc 0008b824 26cec200 003d08e4 c02fc928
Call Trace:
 [<c011259b>] do_page_fault+0x193/0x60f
 [<c011d584>] do_wait+0x2a4/0x358
 [<c0115ff8>] default_wake_function+0x0/0x1c
 [<c0115ff8>] default_wake_function+0x0/0x1c
 [<c011d6c6>] sys_wait4+0x26/0x38
 [<c011d6ee>] sys_waitpid+0x16/0x1a
 [<c0102a19>] syscall_call+0x7/0xb
archive_dir S CBB810CC 0 11219 11060 11221 (NOTLB)
d7793f40 00000082 fffffe00 cbb810cc 00000000 c011259b 28b70a00 003d08e4
 080fa158 00000001 d7793f38 c04715c0 c0473080 bfc51a68 c040e000 cbb81020
 00000001 cbb81020 00000007 d7793fbc 00000000 28b70a00 003d08e4 c02fc928
Call Trace:
 [<c011259b>] do_page_fault+0x193/0x60f
 [<c011d584>] do_wait+0x2a4/0x358
 [<c0115ff8>] default_wake_function+0x0/0x1c
 [<c0115ff8>] default_wake_function+0x0/0x1c
 [<c011d6c6>] sys_wait4+0x26/0x38
 [<c011d6ee>] sys_waitpid+0x16/0x1a
 [<c0102a19>] syscall_call+0x7/0xb
cp R running 0 11221 11219 (NOTLB)
sleep S D77A1F68 0 11906 1409 (NOTLB)
d77a1f58 00000086 0039a67c d77a1f68 bfade9d8 272d8698 b605a700 003d16b7
 d5c1e804 d6ecdbac d77a1f50 c04715c0 c0473080 d77a1fbc d6ecd814 d76d3020
 00000282 c0121f31 0039a67c c107d0e0 00000000 b605a700 003d16b7 d77a1f68
Call Trace:
 [<c0121f31>] lock_timer_base+0x19/0x3c
 [<c02ef4db>] schedule_timeout+0x7b/0x9c
 [<c0122904>] process_timeout+0x0/0xc
 [<c01229fb>] sys_nanosleep+0xdb/0x158
 [<c0102a19>] syscall_call+0x7/0xb
BUG: soft lockup detected on CPU#0!
Pid: 0, comm: swapper
EIP: 0060:[<c02efcd9>] CPU: 0
EIP is at _spin_unlock_irqrestore+0x5/0x8
 EFLAGS: 00000292 Not tainted (2.6.12-rc6-mm1-autokern1)
EAX: c03b9b84 EBX: c03b9ad4 ECX: 0a000000 EDX: 00000292
ESI: 00000074 EDI: c040ffa4 EBP: d5c16000 DS: 007b ES: 007b
CR0: 8005003b CR2: 080f9008 CR3: 16dd0300 CR4: 000006b0
 [<c020e729>] __handle_sysrq+0x121/0x128
 [<c020e74f>] handle_sysrq+0x1f/0x24
 [<c021dda4>] receive_chars+0x16c/0x270
 [<c021e0a2>] serial8250_interrupt+0x66/0xe4
 [<c01320f0>] handle_IRQ_event+0x28/0x58
 [<c0132203>] __do_IRQ+0xe3/0x134
 [<c0104b4b>] do_IRQ+0x1b/0x28
 [<c01033d6>] common_interrupt+0x1a/0x20
 [<c0100bb0>] default_idle+0x0/0x2c
 [<c0100bd3>] default_idle+0x23/0x2c
 [<c0100ca3>] cpu_idle+0x7b/0x8c
 [<c01002c8>] rest_init+0x28/0x2c
 [<c0410899>] start_kernel+0x19d/0x1a0
alt+sysrq+p does weird stuff (is that new patch in your tree Andrew?
doesn't seem to inter-react with the other NMI code well)
Command> break
SysRq : Show Regs
Pid: 0, comm: swapper
EIP: 0060:[<c0100bd3>] CPU: 0
EIP is at default_idle+0x23/0x2c
 EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
ESI: c040e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
CR0: 8005003b CR2: b7e3f5a0 CR3: 16dd0300 CR4: 000006b0
 [<c0100ca3>] cpu_idle+0x7b/0x8c
 [<c01002c8>] rest_init+0x28/0x2c
 [<c0410899>] start_kernel+0x19d/0x1a0
Uhhuh. NMI received for unknown reason 00 on CPU 1.
Uhhuh. NMI received for unknown reason 00 on CPU 16.
Dazed and confused, but trying to continue
Uhhuh. NMI received for unknown reason 00 on CPU 3.
Do you have a strange power saving mode enabled?
Uhhuh. NMI received for unknown reason 00 on CPU 17.
----------- IPI show regs -----------
Pid: 0, comm: swapper
EIP: 0060:[<c0100bd3>] CPU: 16
EIP is at default_idle+0x23/0x2c
Dazed and confused, but trying to continue
Uhhuh. NMI received for unknown reason 00 on CPU 2.
Uhhuh. NMI received for unknown reason 00 on CPU 18.
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
 EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue
Uhhuh. NMI received for unknown reason 00 on CPU 19.
EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
ESI: d7420000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
CR0: 8005003b CR2: 00000000 CR3: 17771800 CR4: 000006b0
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
 [<c0100ca3>] cpu_idle+0x7b/0x8c
 [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 6.
Uhhuh. NMI received for unknown reason 00 on CPU 20.
 start_secondary+0x13d/0x140
Dazed and confused, but trying to continue
 ----------- IPI show regs -----------
Pid: 0, comm: swapper
EIP: 0060:[<c0100bd3>] CPU: 18
Uhhuh. NMI received for unknown reason 00 on CPU 10.
EIP is at default_idle+0x23/0x2c
 EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
ESI: d7426000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
CR0: 8005003b CR2: b7f25d9c CR3: 00474000 CR4: 000006b0
 [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 29.
 cpu_idle+0x7b/0x8c
 [<c010e79d>] start_secondary+0x13d/0x140
----------- IPI show regs -----------
Pid: 0, comm: swapper
EIP: 0060:[<c0100bd3>] CPU: 2
 EIP is at default_idle+0x23/0x2c
 EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
ESI: d7400000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
CR0: 8005003b CR2: b7edeb00 CR3: 00474000 CR4: 000006b0
 [<c0100ca3>] cpu_idle+0x7b/0x8c
 [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 23.
 start_secondary+0x13d/0x140
Uhhuh. NMI received for unknown reason 00 on CPU 7.
 ----------- IPI show regs -----------
Pid: 0, comm: swapper
EIP: 0060:[<c0100bd3>] CPU: 3
EIP is at default_idle+0x23/0x2c
 EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
ESI: d7402000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
Do you have a strange power saving mode enabled?
CR0: 8005003b CR2: b7f95438 CR3: 17771800 CR4: 000006b0
 [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 4.
 cpu_idle+0x7b/0x8c
 [<c010e79d>] start_secondary+0x13d/0x140
 ----------- IPI show regs -----------
Pid: 0, comm: swapper
EIP: 0060:[<c0100bd3>] CPU: 17
Uhhuh. NMI received for unknown reason 00 on CPU 5.
Dazed and confused, but trying to continue
EIP is at default_idle+0x23/0x2c
Uhhuh. NMI received for unknown reason 00 on CPU 14.
 EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
ESI: d7424000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
 [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 9.
 cpu_idle+0x7b/0x8c
Uhhuh. NMI received for unknown reason 00 on CPU 25.
 [<c010e79d>] start_secondary+0x13d/0x140Dazed and confused, but trying to continue
----------- IPI show regs -----------
Pid: 0, comm: swapper
EIP: 0060:[<c0100bd3>] CPU: 19
 EIP is at default_idle+0x23/0x2c
 EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
ESI: d7428000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b
CR0: 8005003b CR2: b7f30d9c CR3: 00474000 CR4: 000006b0
 [<c0100ca3>] cpu_idle+0x7b/0x8c
 [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 13.
 start_secondary+0x13d/0x140
 Do you have a strange power saving mode enabled?
----------- IPI show regs -----------Uhhuh. NMI received for unknown reason 00 on CPU 8.
Uhhuh. NMI received for unknown reason 00 on CPU 11.
Dazed and confused, but trying to continue
Dazed and confused, but trying to continue
Dazed and confused, but trying to continue
Pid: 0, comm: swapper
EIP: 0060:[<c0100bd3>] CPU: 20
Dazed and confused, but trying to continue
Uhhuh. NMI received for unknown reason 00 on CPU 22.
Uhhuh. NMI received for unknown reason 00 on CPU 26.
EIP is at default_idle+0x23/0x2c
 EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1)
EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
ESI: d742a000 EDI: c0470300 EBP: c0470300Uhhuh. NMI received for unknown reason 00 on CPU 30.
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
Do you have a strange power saving mode enabled?
Do you have a strange power saving mode enabled?
 DS: 007b ES: 007b
CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0
Uhhuh. NMI received for unknown reason 00 on CPU 21.
^M^@ [<c0100ca3>]Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@ cpu_idle+0x7b/0x8c ^M^@Dazed and confused, but trying to continue ^M^@ [<c010e79d>]Do you have a strange power saving mode enabled? ^M^@ start_secondary+0x13d/0x140 ^M^@Do you have a strange power saving mode enabled? ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 27. ^M^@Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 24. ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 11221, comm: cp ^M^@EIP: 0060:[<c02efbdc>] CPU: 5 ^M^@Do you have a strange power saving mode enabled? ^M^@EIP is at _spin_lock_irqsave+0x14/0x20 ^M^@ EFLAGS: 00000286 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@Dazed and confused, but trying to continue ^M^@EAX: 00000286 EBX: d6ce4800 ECX: c03cabe0 EDX: c049ba84 ^M^@ESI: ffffffea EDI: d55f8000 EBP: d55f8000 DS: 007b ES: 007b ^M^@Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@Dazed and confused, but trying to continue ^M^@CR0: 80050033 CR2: bfc7d2fc CR3: 16dd02e0 CR4: 000006b0 ^M^@ [<c0270377>]Uhhuh. NMI received for unknown reason 00 on CPU 31. ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 12. ^M^@ ahc_linux_proc_info+0x27/0x212 ^M^@Do you have a strange power saving mode enabled? ^M^@Do you have a strange power saving mode enabled? ^M^@ [<c0149052>]Do you have a strange power saving mode enabled? ^M^@Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@ page_add_anon_rmap+0x62/0x68 ^M^@ [<c0144358>]Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@Uhhuh. 
NMI received for unknown reason 00 on CPU 15. ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 28. ^M^@Dazed and confused, but trying to continue ^M^@ do_anonymous_page+0x1f0/0x21c ^M^@ [<c0144370>]Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@ do_anonymous_page+0x208/0x21c ^M^@Dazed and confused, but trying to continue ^M^@ [<c01443d9>]Do you have a strange power saving mode enabled? ^M^@Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@Do you have a strange power saving mode enabled? ^M^@ do_no_page+0x55/0x3e8 ^M^@ [<c01372b5>] prep_new_page+0x49/0x50 ^M^@ [<c0137973>] buffered_rmqueue+0x16f/0x1d0 ^M^@ [<c0137e1b>] __alloc_pages+0x3bb/0x3c8 ^M^@ [<c0257cdb>] proc_scsi_read+0x2b/0x44 ^M^@ [<c0182f28>] proc_file_read+0xec/0x200 ^M^@ [<c0152ff9>] vfs_read+0x91/0x12c ^M^@ [<c01532e4>] sys_read+0x40/0x6c ^M^@ [<c0102a19>] syscall_call+0x7/0xb ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 7 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d740c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 4 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7404000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 080f9c48 CR3: 17771320 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c02efcae>] CPU: 30 
^M^@EIP is at _spin_lock+0xa/0x10 ^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: c1050aa0 EBX: c1050aa0 ECX: d7463ea8 EDX: 00000003 ^M^@ESI: c10d9620 EDI: c10d9fe0 EBP: d7463eb0 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7eea900 CR3: 00474000 CR4: 000006b0 ^M^@ [<c011583b>] load_balance+0xcf/0x170 ^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104 ^M^@ [<c0115d77>] scheduler_tick+0x97/0x318 ^M^@ [<c01225b3>] update_process_times+0xef/0x100 ^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4 ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24 ^M^@ [<c0100bb0>] default_idle+0x0/0x2c ^M^@ [<c0100bd3>] default_idle+0x23/0x2c ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c02efb6a>] CPU: 15 ^M^@EIP is at _spin_trylock+0x6/0x14 ^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c1050aa0 ECX: 00000008 EDX: c1050aa0 ^M^@ESI: c10875a0 EDI: c1087f60 EBP: d741fe84 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0114fda>] double_lock_balance+0x12/0x48 ^M^@ [<c01157e4>] load_balance+0x78/0x170 ^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104 ^M^@ [<c0115d77>] scheduler_tick+0x97/0x318 ^M^@ [<c01225b3>] update_process_times+0xef/0x100 ^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4 ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24 ^M^@ [<c0100bb0>] default_idle+0x0/0x2c ^M^@ [<c0100bd3>] default_idle+0x23/0x2c ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 21 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d742c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ 
[<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 14 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d741c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7e64070 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 27 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d745c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7f66d9c CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 8 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d740e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 080f133c CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 25 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7436000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7f74d9c CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- 
^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c02efb6a>] CPU: 29 ^M^@EIP is at _spin_trylock+0x6/0x14 ^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000001 EBX: c1050aa0 ECX: 00000008 EDX: c1050aa0 ^M^@ESI: c10d3ea0 EDI: c10d4860 EBP: d7461e84 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0114fda>] double_lock_balance+0x12/0x48 ^M^@ [<c01157e4>] load_balance+0x78/0x170 ^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104 ^M^@ [<c0115d77>] scheduler_tick+0x97/0x318 ^M^@ [<c01225b3>] update_process_times+0xef/0x100 ^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4 ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24 ^M^@ [<c0100bb0>] default_idle+0x0/0x2c ^M^@ [<c0100bd3>] default_idle+0x23/0x2c ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 31 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7464000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 24 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7434000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7f3cdd8 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 10 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: 
c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7412000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7ea6920 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 26 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7438000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c01154a3>] CPU: 13 ^M^@EIP is at find_busiest_group+0x103/0x2f8 ^M^@ EFLAGS: 00000086 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000005 EBX: 00000005 ECX: c1050aa0 EDX: 00000000 ^M^@ESI: c04813ac EDI: 00000200 EBP: d741be7c DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7e7e070 CR3: 00474000 CR4: 000006b0 ^M^@ [<c01157a2>] load_balance+0x36/0x170 ^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104 ^M^@ [<c0115d77>] scheduler_tick+0x97/0x318 ^M^@ [<c01225b3>] update_process_times+0xef/0x100 ^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4 ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24 ^M^@ [<c0100bb0>] default_idle+0x0/0x2c ^M^@ [<c0100bd3>] default_idle+0x23/0x2c ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 28 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d745e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] 
start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c011e897>] CPU: 12 ^M^@EIP is at __do_softirq+0x47/0x100 ^M^@ EFLAGS: 00000006 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: c0470380 EBX: c0476020 ECX: 00000030 EDX: c1075ce0 ^M^@ESI: 00000002 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7f54000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c011e97f>] do_softirq+0x2f/0x34 ^M^@ [<c011ea24>] irq_exit+0x34/0x38 ^M^@ [<c010f601>] smp_apic_timer_interrupt+0xdd/0xe4 ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24 ^M^@ [<c0100bb0>] default_idle+0x0/0x2c ^M^@ [<c0100bd3>] default_idle+0x23/0x2c ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 9 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7410000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7f1d900 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 11 ^M^@ EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7414000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 23 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7432000 EDI: c0470300 EBP: 
c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 6 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7408000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7f3cdd8 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 22 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7430000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 1 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: c13fc000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7ee1d9c CR3: 17771640 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ^M ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-08 23:14 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-08 23:22 ` Andrew Morton
2005-06-08 23:34 ` 2.6.12-rc6-mm1 Martin J. Bligh
0 siblings, 1 reply; 72+ messages in thread
From: Andrew Morton @ 2005-06-08 23:22 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: apw, pazke, linux-kernel

"Martin J. Bligh" <mbligh@mbligh.org> wrote:
>
> alt+sysrq+p does weird stuff (is that a new patch in your tree, Andrew?
> doesn't seem to interact well with the other NMI code)

What patch?

^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-08 23:22 ` 2.6.12-rc6-mm1 Andrew Morton
@ 2005-06-08 23:34 ` Martin J. Bligh
2005-06-09 7:17 ` 2.6.12-rc6-mm1 Kirill Korotaev
0 siblings, 1 reply; 72+ messages in thread
From: Martin J. Bligh @ 2005-06-08 23:34 UTC (permalink / raw)
To: Andrew Morton; +Cc: apw, pazke, linux-kernel, dev

--On Wednesday, June 08, 2005 16:22:47 -0700 Andrew Morton <akpm@osdl.org> wrote:

> "Martin J. Bligh" <mbligh@mbligh.org> wrote:
>>
>> alt+sysrq+p does weird stuff (is that a new patch in your tree, Andrew?
>> doesn't seem to interact well with the other NMI code)
>
> What patch?

Sorry.

nmi-lockup-and-altsysrq-p-dumping-calltraces-on-_all_-cpus.patch

It does seem to work. But probably needs some cleanup for the NMI
errors.

^ permalink raw reply [flat|nested] 72+ messages in thread
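The "NMI errors" mentioned above are the "Uhhuh. NMI received for unknown reason" lines that appear interleaved with the register dumps in the console capture earlier in the thread. The following is a minimal sketch, not something posted by anyone in the thread, of how such a capture could be tallied per CPU. The function names (`clean`, `summarize`) and the sample string are illustrative assumptions, and the patterns only cover the message formats visible in this capture.

```python
import re
from collections import Counter

# Caret-notation CR/NUL as printed in the archive, plus the raw bytes they
# stand for, in case the capture was saved unescaped (an assumption).
DEBRIS = re.compile(r"\^M|\^@|\r|\x00")
# "Uhhuh. NMI received for unknown reason 00 on CPU 16."
UNKNOWN_NMI = re.compile(r"NMI received for unknown reason [0-9a-f]{2} on CPU (\d+)\.")
# "EIP: 0060:[<c0100bd3>] CPU: 16" opens each register dump.
DUMP_HEADER = re.compile(r"EIP:\s+[0-9a-f]{4}:\[<[0-9a-f]{8}>\]\s+CPU:\s+(\d+)")


def clean(capture: str) -> str:
    """Replace console debris with spaces and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", DEBRIS.sub(" ", capture)).strip()


def summarize(capture: str):
    """Return (unknown-NMI count per CPU, sorted CPUs that printed a dump)."""
    text = clean(capture)
    unknown = Counter(int(cpu) for cpu in UNKNOWN_NMI.findall(text))
    dumped = sorted({int(cpu) for cpu in DUMP_HEADER.findall(text)})
    return unknown, dumped


if __name__ == "__main__":
    # Hypothetical excerpt assembled from lines of the capture in this thread.
    sample = (
        "^M^@Uhhuh. NMI received for unknown reason 00 on CPU 1. "
        "^M^@Uhhuh. NMI received for unknown reason 00 on CPU 16. "
        "^M^@Dazed and confused, but trying to continue "
        "^M^@----------- IPI show regs ----------- "
        "^M^@Pid: 0, comm: swapper "
        "^M^@EIP: 0060:[<c0100bd3>] CPU: 16 "
    )
    unknown, dumped = summarize(sample)
    print("unknown-NMI reports per CPU:", dict(unknown))
    print("CPUs with a register dump:", dumped)
```

Matching on the message text rather than on line starts is deliberate: output from different CPUs is interleaved mid-line in the capture (for example "[<c010e79d>]Uhhuh. NMI received ..."), so line-anchored patterns would miss some reports.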
* Re: 2.6.12-rc6-mm1
2005-06-08 23:34 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-09 7:17 ` Kirill Korotaev
2005-06-09 13:38 ` 2.6.12-rc6-mm1 Martin J. Bligh
0 siblings, 1 reply; 72+ messages in thread
From: Kirill Korotaev @ 2005-06-09 7:17 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: Andrew Morton, apw, pazke, linux-kernel

> --On Wednesday, June 08, 2005 16:22:47 -0700 Andrew Morton <akpm@osdl.org> wrote:
>
>> "Martin J. Bligh" <mbligh@mbligh.org> wrote:
>>
>>> alt+sysrq+p does weird stuff (is that a new patch in your tree, Andrew?
>>> doesn't seem to interact well with the other NMI code)
>>
>> What patch?
>
> Sorry.
>
> nmi-lockup-and-altsysrq-p-dumping-calltraces-on-_all_-cpus.patch
>
> It does seem to work. But probably needs some cleanup for the NMI
> errors.

If you let me know where the problem comes from, I can fix it and clean it up.

Kirill

^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-09 7:17 ` 2.6.12-rc6-mm1 Kirill Korotaev
@ 2005-06-09 13:38 ` Martin J. Bligh
2005-06-10 12:12 ` 2.6.12-rc6-mm1 Kirill Korotaev
0 siblings, 1 reply; 72+ messages in thread
From: Martin J. Bligh @ 2005-06-09 13:38 UTC (permalink / raw)
To: Kirill Korotaev; +Cc: Andrew Morton, apw, pazke, linux-kernel

--Kirill Korotaev <dev@sw.ru> wrote (on Thursday, June 09, 2005 11:17:43 +0400):

>> --On Wednesday, June 08, 2005 16:22:47 -0700 Andrew Morton <akpm@osdl.org> wrote:
>>
>>> "Martin J. Bligh" <mbligh@mbligh.org> wrote:
>>>
>>>> alt+sysrq+p does weird stuff (is that a new patch in your tree, Andrew?
>>>> doesn't seem to interact well with the other NMI code)
>>>
>>> What patch?
>>
>> Sorry.
>>
>> nmi-lockup-and-altsysrq-p-dumping-calltraces-on-_all_-cpus.patch
>>
>> It does seem to work. But probably needs some cleanup for the NMI
>> errors.
> If you let me know where the problem comes from, I can fix it and clean it up.

It gets a lot of the "dazed and confused" errors. Possibly you just need
to disable that part of the handler?

Command> break ^@SysRq : Show Regs ^M ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 0 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: c040e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7e3f5a0 CR3: 16dd0300 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c01002c8>] rest_init+0x28/0x2c ^M^@ [<c0410899>] start_kernel+0x19d/0x1a0 ^M^@ Uhhuh. NMI received for unknown reason 00 on CPU 1. ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 16. ^M^@Dazed and confused, but trying to continue ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 3. ^M^@Do you have a strange power saving mode enabled? ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 17.
^M^@----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 16 ^M^@EIP is at default_idle+0x23/0x2c ^M^@Dazed and confused, but trying to continue ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 2. ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 18. ^M^@Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@Do you have a strange power saving mode enabled? ^M^@Dazed and confused, but trying to continue ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 19. ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7420000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@Do you have a strange power saving mode enabled? ^M^@Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@CR0: 8005003b CR2: 00000000 CR3: 17771800 CR4: 000006b0 ^M^@Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 6. ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 20. ^M^@ start_secondary+0x13d/0x140 ^M^@Dazed and confused, but trying to continue ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 18 ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 10. ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7426000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7f25d9c CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 29. 
^M^@ cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 2 ^M^@ EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7400000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7edeb00 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 23. ^M^@ start_secondary+0x13d/0x140 ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 7. ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 3 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7402000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@Do you have a strange power saving mode enabled? ^M^@CR0: 8005003b CR2: b7f95438 CR3: 17771800 CR4: 000006b0 ^M^@ [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 4. ^M^@ cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 17 ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 5. ^M^@Dazed and confused, but trying to continue ^M^@EIP is at default_idle+0x23/0x2c ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 14. ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7424000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 9. ^M^@ cpu_idle+0x7b/0x8c ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 25. 
^M^@ [<c010e79d>] start_secondary+0x13d/0x140Dazed and confused, but trying to continue ^M ^M^@----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 19 ^M^@ EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7428000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7f30d9c CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 13. ^M^@ start_secondary+0x13d/0x140 ^M^@ Do you have a strange power saving mode enabled? ^M^@----------- IPI show regs -----------Uhhuh. NMI received for unknown reason 00 on CPU 8. ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 11. ^M^@Dazed and confused, but trying to continue ^M^@Dazed and confused, but trying to continue ^M^@Dazed and confused, but trying to continue ^M ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 20 ^M^@Dazed and confused, but trying to continue ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 22. ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 26. ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@ESI: d742a000 EDI: c0470300 EBP: c0470300Uhhuh. NMI received for unknown reason 00 on CPU 30. ^M^@Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@Do you have a strange power saving mode enabled? ^M^@Do you have a strange power saving mode enabled? ^M^@ DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 21. 
^M^@ [<c0100ca3>]Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@ cpu_idle+0x7b/0x8c ^M^@Dazed and confused, but trying to continue ^M^@ [<c010e79d>]Do you have a strange power saving mode enabled? ^M^@ start_secondary+0x13d/0x140 ^M^@Do you have a strange power saving mode enabled? ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 27. ^M^@Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 24. ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 11221, comm: cp ^M^@EIP: 0060:[<c02efbdc>] CPU: 5 ^M^@Do you have a strange power saving mode enabled? ^M^@EIP is at _spin_lock_irqsave+0x14/0x20 ^M^@ EFLAGS: 00000286 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@Dazed and confused, but trying to continue ^M^@EAX: 00000286 EBX: d6ce4800 ECX: c03cabe0 EDX: c049ba84 ^M^@ESI: ffffffea EDI: d55f8000 EBP: d55f8000 DS: 007b ES: 007b ^M^@Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@Dazed and confused, but trying to continue ^M^@CR0: 80050033 CR2: bfc7d2fc CR3: 16dd02e0 CR4: 000006b0 ^M^@ [<c0270377>]Uhhuh. NMI received for unknown reason 00 on CPU 31. ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 12. ^M^@ ahc_linux_proc_info+0x27/0x212 ^M^@Do you have a strange power saving mode enabled? ^M^@Do you have a strange power saving mode enabled? ^M^@ [<c0149052>]Do you have a strange power saving mode enabled? ^M^@Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@ page_add_anon_rmap+0x62/0x68 ^M^@ [<c0144358>]Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@Uhhuh. 
NMI received for unknown reason 00 on CPU 15. ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 28. ^M^@Dazed and confused, but trying to continue ^M^@ do_anonymous_page+0x1f0/0x21c ^M^@ [<c0144370>]Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@ do_anonymous_page+0x208/0x21c ^M^@Dazed and confused, but trying to continue ^M^@ [<c01443d9>]Do you have a strange power saving mode enabled? ^M^@Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@Do you have a strange power saving mode enabled? ^M^@ do_no_page+0x55/0x3e8 ^M^@ [<c01372b5>] prep_new_page+0x49/0x50 ^M^@ [<c0137973>] buffered_rmqueue+0x16f/0x1d0 ^M^@ [<c0137e1b>] __alloc_pages+0x3bb/0x3c8 ^M^@ [<c0257cdb>] proc_scsi_read+0x2b/0x44 ^M^@ [<c0182f28>] proc_file_read+0xec/0x200 ^M^@ [<c0152ff9>] vfs_read+0x91/0x12c ^M^@ [<c01532e4>] sys_read+0x40/0x6c ^M^@ [<c0102a19>] syscall_call+0x7/0xb ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 7 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d740c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 4 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7404000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 080f9c48 CR3: 17771320 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c02efcae>] CPU: 30 
^M^@EIP is at _spin_lock+0xa/0x10 ^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: c1050aa0 EBX: c1050aa0 ECX: d7463ea8 EDX: 00000003 ^M^@ESI: c10d9620 EDI: c10d9fe0 EBP: d7463eb0 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7eea900 CR3: 00474000 CR4: 000006b0 ^M^@ [<c011583b>] load_balance+0xcf/0x170 ^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104 ^M^@ [<c0115d77>] scheduler_tick+0x97/0x318 ^M^@ [<c01225b3>] update_process_times+0xef/0x100 ^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4 ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24 ^M^@ [<c0100bb0>] default_idle+0x0/0x2c ^M^@ [<c0100bd3>] default_idle+0x23/0x2c ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c02efb6a>] CPU: 15 ^M^@EIP is at _spin_trylock+0x6/0x14 ^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c1050aa0 ECX: 00000008 EDX: c1050aa0 ^M^@ESI: c10875a0 EDI: c1087f60 EBP: d741fe84 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0114fda>] double_lock_balance+0x12/0x48 ^M^@ [<c01157e4>] load_balance+0x78/0x170 ^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104 ^M^@ [<c0115d77>] scheduler_tick+0x97/0x318 ^M^@ [<c01225b3>] update_process_times+0xef/0x100 ^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4 ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24 ^M^@ [<c0100bb0>] default_idle+0x0/0x2c ^M^@ [<c0100bd3>] default_idle+0x23/0x2c ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 21 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d742c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ 
[<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 14 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d741c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7e64070 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 27 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d745c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7f66d9c CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 8 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d740e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 080f133c CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 25 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7436000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7f74d9c CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- 
^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c02efb6a>] CPU: 29 ^M^@EIP is at _spin_trylock+0x6/0x14 ^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000001 EBX: c1050aa0 ECX: 00000008 EDX: c1050aa0 ^M^@ESI: c10d3ea0 EDI: c10d4860 EBP: d7461e84 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0114fda>] double_lock_balance+0x12/0x48 ^M^@ [<c01157e4>] load_balance+0x78/0x170 ^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104 ^M^@ [<c0115d77>] scheduler_tick+0x97/0x318 ^M^@ [<c01225b3>] update_process_times+0xef/0x100 ^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4 ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24 ^M^@ [<c0100bb0>] default_idle+0x0/0x2c ^M^@ [<c0100bd3>] default_idle+0x23/0x2c ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 31 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7464000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 24 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7434000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7f3cdd8 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 10 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: 
c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7412000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7ea6920 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 26 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7438000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c01154a3>] CPU: 13 ^M^@EIP is at find_busiest_group+0x103/0x2f8 ^M^@ EFLAGS: 00000086 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000005 EBX: 00000005 ECX: c1050aa0 EDX: 00000000 ^M^@ESI: c04813ac EDI: 00000200 EBP: d741be7c DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7e7e070 CR3: 00474000 CR4: 000006b0 ^M^@ [<c01157a2>] load_balance+0x36/0x170 ^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104 ^M^@ [<c0115d77>] scheduler_tick+0x97/0x318 ^M^@ [<c01225b3>] update_process_times+0xef/0x100 ^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4 ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24 ^M^@ [<c0100bb0>] default_idle+0x0/0x2c ^M^@ [<c0100bd3>] default_idle+0x23/0x2c ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 28 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d745e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] 
start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c011e897>] CPU: 12 ^M^@EIP is at __do_softirq+0x47/0x100 ^M^@ EFLAGS: 00000006 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: c0470380 EBX: c0476020 ECX: 00000030 EDX: c1075ce0 ^M^@ESI: 00000002 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7f54000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c011e97f>] do_softirq+0x2f/0x34 ^M^@ [<c011ea24>] irq_exit+0x34/0x38 ^M^@ [<c010f601>] smp_apic_timer_interrupt+0xdd/0xe4 ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24 ^M^@ [<c0100bb0>] default_idle+0x0/0x2c ^M^@ [<c0100bd3>] default_idle+0x23/0x2c ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 9 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7410000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7f1d900 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 11 ^M^@ EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7414000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 23 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7432000 EDI: c0470300 EBP: 
c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 6 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7408000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7f3cdd8 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 22 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: d7430000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ Dazed and confused, but trying to continue ^M^@Do you have a strange power saving mode enabled? ^M^@----------- IPI show regs ----------- ^M^@Pid: 0, comm: swapper ^M^@EIP: 0060:[<c0100bd3>] CPU: 1 ^M^@EIP is at default_idle+0x23/0x2c ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 ^M^@ESI: c13fc000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b ^M^@CR0: 8005003b CR2: b7ee1d9c CR3: 17771640 CR4: 000006b0 ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 ^M^@ ^M ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-09 13:38 ` 2.6.12-rc6-mm1 Martin J. Bligh
@ 2005-06-10 12:12 ` Kirill Korotaev
0 siblings, 0 replies; 72+ messages in thread
From: Kirill Korotaev @ 2005-06-10 12:12 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: Andrew Morton, apw, pazke, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 24361 bytes --]
>>>>>alt+sysrq+p does wierd stuff (is that new patch in your tree Andrew?
>>>>>doesn't seem to inter-react with the other NMI code well)
>>>>
>>>>What patch?
>>>
>>>
>>>Sorry.
>>>
>>>nmi-lockup-and-altsysrq-p-dumping-calltraces-on-_all_-cpus.patch
>>>
>>>It does seem to work. But probably needs some cleanup for the NMI
>>>errors.
>>
>>If you give me to know where the problem come from I can fix it and make a cleanup.
>
>
> It gets a lot of the "dazed and confused" errors. Possibly you just need
> to disable that part of the handler?

Can you try this cleanup patch?
It fixes the problem for me, though I do not much like the way it does so...

Kirill

> Command> break > ^@SysRq : Show Regs > ^M > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 0 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: c040e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: b7e3f5a0 CR3: 16dd0300 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c01002c8>] rest_init+0x28/0x2c > ^M^@ [<c0410899>] start_kernel+0x19d/0x1a0 > ^M^@ Uhhuh. NMI received for unknown reason 00 on CPU 1. > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 16. > ^M^@Dazed and confused, but trying to continue > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 3. > ^M^@Do you have a strange power saving mode enabled? > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 17.
> ^M^@----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 16 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@Dazed and confused, but trying to continue > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 2. > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 18. > ^M^@Dazed and confused, but trying to continue > ^M^@Do you have a strange power saving mode enabled? > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@Do you have a strange power saving mode enabled? > ^M^@Dazed and confused, but trying to continue > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 19. > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d7420000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@Do you have a strange power saving mode enabled? > ^M^@Dazed and confused, but trying to continue > ^M^@Do you have a strange power saving mode enabled? > ^M^@CR0: 8005003b CR2: 00000000 CR3: 17771800 CR4: 000006b0 > ^M^@Dazed and confused, but trying to continue > ^M^@Do you have a strange power saving mode enabled? > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 6. > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 20. > ^M^@ start_secondary+0x13d/0x140 > ^M^@Dazed and confused, but trying to continue > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 18 > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 10. > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d7426000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: b7f25d9c CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 29. 
> ^M^@ cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 2 > ^M^@ EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d7400000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: b7edeb00 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 23. > ^M^@ start_secondary+0x13d/0x140 > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 7. > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 3 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d7402000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@Do you have a strange power saving mode enabled? > ^M^@CR0: 8005003b CR2: b7f95438 CR3: 17771800 CR4: 000006b0 > ^M^@ [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 4. > ^M^@ cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 17 > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 5. > ^M^@Dazed and confused, but trying to continue > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 14. > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d7424000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>]Uhhuh. NMI received for unknown reason 00 on CPU 9. > ^M^@ cpu_idle+0x7b/0x8c > ^M^@Uhhuh. 
NMI received for unknown reason 00 on CPU 25. > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140Dazed and confused, but trying to continue > ^M > ^M^@----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 19 > ^M^@ EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d7428000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: b7f30d9c CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>]Uhhuh. NMI received for unknown reason 00 on CPU 13. > ^M^@ start_secondary+0x13d/0x140 > ^M^@ Do you have a strange power saving mode enabled? > ^M^@----------- IPI show regs -----------Uhhuh. NMI received for unknown reason 00 on CPU 8. > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 11. > ^M^@Dazed and confused, but trying to continue > ^M^@Dazed and confused, but trying to continue > ^M^@Dazed and confused, but trying to continue > ^M > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 20 > ^M^@Dazed and confused, but trying to continue > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 22. > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 26. > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@Dazed and confused, but trying to continue > ^M^@Do you have a strange power saving mode enabled? > ^M^@Dazed and confused, but trying to continue > ^M^@Do you have a strange power saving mode enabled? > ^M^@ESI: d742a000 EDI: c0470300 EBP: c0470300Uhhuh. NMI received for unknown reason 00 on CPU 30. > ^M^@Dazed and confused, but trying to continue > ^M^@Do you have a strange power saving mode enabled? > ^M^@Do you have a strange power saving mode enabled? > ^M^@Do you have a strange power saving mode enabled? 
> ^M^@ DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 21. > ^M^@ [<c0100ca3>]Dazed and confused, but trying to continue > ^M^@Do you have a strange power saving mode enabled? > ^M^@Dazed and confused, but trying to continue > ^M^@Do you have a strange power saving mode enabled? > ^M^@ cpu_idle+0x7b/0x8c > ^M^@Dazed and confused, but trying to continue > ^M^@ [<c010e79d>]Do you have a strange power saving mode enabled? > ^M^@ start_secondary+0x13d/0x140 > ^M^@Do you have a strange power saving mode enabled? > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 27. > ^M^@Dazed and confused, but trying to continue > ^M^@Do you have a strange power saving mode enabled? > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 24. > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 11221, comm: cp > ^M^@EIP: 0060:[<c02efbdc>] CPU: 5 > ^M^@Do you have a strange power saving mode enabled? > ^M^@EIP is at _spin_lock_irqsave+0x14/0x20 > ^M^@ EFLAGS: 00000286 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@Dazed and confused, but trying to continue > ^M^@EAX: 00000286 EBX: d6ce4800 ECX: c03cabe0 EDX: c049ba84 > ^M^@ESI: ffffffea EDI: d55f8000 EBP: d55f8000 DS: 007b ES: 007b > ^M^@Dazed and confused, but trying to continue > ^M^@Do you have a strange power saving mode enabled? > ^M^@Dazed and confused, but trying to continue > ^M^@Do you have a strange power saving mode enabled? > ^M^@Dazed and confused, but trying to continue > ^M^@CR0: 80050033 CR2: bfc7d2fc CR3: 16dd02e0 CR4: 000006b0 > ^M^@ [<c0270377>]Uhhuh. NMI received for unknown reason 00 on CPU 31. > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 12. > ^M^@ ahc_linux_proc_info+0x27/0x212 > ^M^@Do you have a strange power saving mode enabled? > ^M^@Do you have a strange power saving mode enabled? > ^M^@ [<c0149052>]Do you have a strange power saving mode enabled? 
> ^M^@Dazed and confused, but trying to continue > ^M^@Do you have a strange power saving mode enabled? > ^M^@ page_add_anon_rmap+0x62/0x68 > ^M^@ [<c0144358>]Dazed and confused, but trying to continue > ^M^@Do you have a strange power saving mode enabled? > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 15. > ^M^@Uhhuh. NMI received for unknown reason 00 on CPU 28. > ^M^@Dazed and confused, but trying to continue > ^M^@ do_anonymous_page+0x1f0/0x21c > ^M^@ [<c0144370>]Dazed and confused, but trying to continue > ^M^@Do you have a strange power saving mode enabled? > ^M^@ do_anonymous_page+0x208/0x21c > ^M^@Dazed and confused, but trying to continue > ^M^@ [<c01443d9>]Do you have a strange power saving mode enabled? > ^M^@Dazed and confused, but trying to continue > ^M^@Do you have a strange power saving mode enabled? > ^M^@Do you have a strange power saving mode enabled? > ^M^@ do_no_page+0x55/0x3e8 > ^M^@ [<c01372b5>] prep_new_page+0x49/0x50 > ^M^@ [<c0137973>] buffered_rmqueue+0x16f/0x1d0 > ^M^@ [<c0137e1b>] __alloc_pages+0x3bb/0x3c8 > ^M^@ [<c0257cdb>] proc_scsi_read+0x2b/0x44 > ^M^@ [<c0182f28>] proc_file_read+0xec/0x200 > ^M^@ [<c0152ff9>] vfs_read+0x91/0x12c > ^M^@ [<c01532e4>] sys_read+0x40/0x6c > ^M^@ [<c0102a19>] syscall_call+0x7/0xb > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 7 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d740c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 4 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: 
c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d7404000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: 080f9c48 CR3: 17771320 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c02efcae>] CPU: 30 > ^M^@EIP is at _spin_lock+0xa/0x10 > ^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: c1050aa0 EBX: c1050aa0 ECX: d7463ea8 EDX: 00000003 > ^M^@ESI: c10d9620 EDI: c10d9fe0 EBP: d7463eb0 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: b7eea900 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c011583b>] load_balance+0xcf/0x170 > ^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104 > ^M^@ [<c0115d77>] scheduler_tick+0x97/0x318 > ^M^@ [<c01225b3>] update_process_times+0xef/0x100 > ^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4 > ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24 > ^M^@ [<c0100bb0>] default_idle+0x0/0x2c > ^M^@ [<c0100bd3>] default_idle+0x23/0x2c > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c02efb6a>] CPU: 15 > ^M^@EIP is at _spin_trylock+0x6/0x14 > ^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c1050aa0 ECX: 00000008 EDX: c1050aa0 > ^M^@ESI: c10875a0 EDI: c1087f60 EBP: d741fe84 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0114fda>] double_lock_balance+0x12/0x48 > ^M^@ [<c01157e4>] load_balance+0x78/0x170 > ^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104 > ^M^@ [<c0115d77>] scheduler_tick+0x97/0x318 > ^M^@ [<c01225b3>] update_process_times+0xef/0x100 > ^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4 > ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24 > ^M^@ [<c0100bb0>] default_idle+0x0/0x2c > ^M^@ [<c0100bd3>] default_idle+0x23/0x2c > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ 
[<c010e79d>] start_secondary+0x13d/0x140 > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 21 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d742c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 14 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d741c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: b7e64070 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 27 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d745c000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: b7f66d9c CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 8 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d740e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: 080f133c CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > 
^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 25 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d7436000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: b7f74d9c CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c02efb6a>] CPU: 29 > ^M^@EIP is at _spin_trylock+0x6/0x14 > ^M^@ EFLAGS: 00000046 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000001 EBX: c1050aa0 ECX: 00000008 EDX: c1050aa0 > ^M^@ESI: c10d3ea0 EDI: c10d4860 EBP: d7461e84 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0114fda>] double_lock_balance+0x12/0x48 > ^M^@ [<c01157e4>] load_balance+0x78/0x170 > ^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104 > ^M^@ [<c0115d77>] scheduler_tick+0x97/0x318 > ^M^@ [<c01225b3>] update_process_times+0xef/0x100 > ^M^@ [<c010f5f9>] smp_apic_timer_interrupt+0xd5/0xe4 > ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24 > ^M^@ [<c0100bb0>] default_idle+0x0/0x2c > ^M^@ [<c0100bd3>] default_idle+0x23/0x2c > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 31 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d7464000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 
24 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d7434000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: b7f3cdd8 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 10 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d7412000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: b7ea6920 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 26 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d7438000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c01154a3>] CPU: 13 > ^M^@EIP is at find_busiest_group+0x103/0x2f8 > ^M^@ EFLAGS: 00000086 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000005 EBX: 00000005 ECX: c1050aa0 EDX: 00000000 > ^M^@ESI: c04813ac EDI: 00000200 EBP: d741be7c DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: b7e7e070 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c01157a2>] load_balance+0x36/0x170 > ^M^@ [<c0115af5>] rebalance_tick+0xe1/0x104 > ^M^@ [<c0115d77>] scheduler_tick+0x97/0x318 > ^M^@ [<c01225b3>] update_process_times+0xef/0x100 > ^M^@ [<c010f5f9>] 
smp_apic_timer_interrupt+0xd5/0xe4 > ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24 > ^M^@ [<c0100bb0>] default_idle+0x0/0x2c > ^M^@ [<c0100bd3>] default_idle+0x23/0x2c > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 28 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d745e000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c011e897>] CPU: 12 > ^M^@EIP is at __do_softirq+0x47/0x100 > ^M^@ EFLAGS: 00000006 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: c0470380 EBX: c0476020 ECX: 00000030 EDX: c1075ce0 > ^M^@ESI: 00000002 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: b7f54000 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c011e97f>] do_softirq+0x2f/0x34 > ^M^@ [<c011ea24>] irq_exit+0x34/0x38 > ^M^@ [<c010f601>] smp_apic_timer_interrupt+0xdd/0xe4 > ^M^@ [<c0103464>] apic_timer_interrupt+0x1c/0x24 > ^M^@ [<c0100bb0>] default_idle+0x0/0x2c > ^M^@ [<c0100bd3>] default_idle+0x23/0x2c > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 9 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d7410000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: b7f1d900 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > 
^M^@----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 11 > ^M^@ EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d7414000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@ ----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 23 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d7432000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 6 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d7408000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: b7f3cdd8 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 22 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: d7430000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: 00000000 CR3: 00474000 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@ Dazed and confused, but trying to 
continue > ^M^@Do you have a strange power saving mode enabled? > ^M^@----------- IPI show regs ----------- > ^M^@Pid: 0, comm: swapper > ^M^@EIP: 0060:[<c0100bd3>] CPU: 1 > ^M^@EIP is at default_idle+0x23/0x2c > ^M^@ EFLAGS: 00000246 Not tainted (2.6.12-rc6-mm1-autokern1) > ^M^@EAX: 00000000 EBX: c0476020 ECX: c0100bb0 EDX: 003a2f43 > ^M^@ESI: c13fc000 EDI: c0470300 EBP: c0470300 DS: 007b ES: 007b > ^M^@CR0: 8005003b CR2: b7ee1d9c CR3: 17771640 CR4: 000006b0 > ^M^@ [<c0100ca3>] cpu_idle+0x7b/0x8c > ^M^@ [<c010e79d>] start_secondary+0x13d/0x140 > ^M^@ ^M > > > > > > [-- Attachment #2: altsysrq-p-cleanup --] [-- Type: text/plain, Size: 1080 bytes --] --- ./arch/i386/kernel/traps.c.xxx 2005-05-10 18:27:04.000000000 +0400 +++ ./arch/i386/kernel/traps.c 2005-06-10 14:18:32.000000000 +0400 @@ -574,6 +574,14 @@ void die_nmi (struct pt_regs *regs, cons do_exit(SIGSEGV); } +static int dummy_nmi_callback(struct pt_regs * regs, int cpu) +{ + return 0; +} + +static nmi_callback_t nmi_callback = dummy_nmi_callback; +static nmi_callback_t nmi_ipi_callback = dummy_nmi_callback; + static void default_do_nmi(struct pt_regs * regs) { unsigned char reason = 0; @@ -596,6 +604,9 @@ static void default_do_nmi(struct pt_reg return; } #endif + if (nmi_ipi_callback != dummy_nmi_callback) + return; + unknown_nmi_error(reason, regs); return; } @@ -612,14 +623,6 @@ static void default_do_nmi(struct pt_reg reassert_nmi(); } -static int dummy_nmi_callback(struct pt_regs * regs, int cpu) -{ - return 0; -} - -static nmi_callback_t nmi_callback = dummy_nmi_callback; -static nmi_callback_t nmi_ipi_callback = dummy_nmi_callback; - fastcall void do_nmi(struct pt_regs * regs, long error_code) { int cpu; ^ permalink raw reply [flat|nested] 72+ messages in thread
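The altsysrq-p-cleanup attachment above appears to treat the dummy callback as a sentinel: if something other than the dummy is installed in nmi_ipi_callback, the NMI is taken to be expected and the "Dazed and confused" report is skipped. A minimal user-space sketch of that idea follows. Every name in it (nmi_handler_fn, ipi_handler, handle_unclaimed_nmi and so on) is a hypothetical stand-in, not the kernel's code.

```c
/* Hypothetical user-space model; these are not the kernel's definitions. */
typedef int (*nmi_handler_fn)(int cpu);

/* Default handler: claims nothing. Its address doubles as "no handler set". */
static int dummy_handler(int cpu)
{
	(void)cpu;
	return 0;
}

/* Stand-in for a handler installed while an IPI register dump is running. */
static int show_regs_handler(int cpu)
{
	(void)cpu;
	return 1;
}

static nmi_handler_fn ipi_handler = dummy_handler;
static int unknown_nmi_reports;	/* counts "unknown NMI" style reports */

/* Model of the tail of the default NMI path after the patch. */
static void handle_unclaimed_nmi(void)
{
	/* A real handler is installed, so the NMI was expected: stay quiet. */
	if (ipi_handler != dummy_handler)
		return;
	unknown_nmi_reports++;
}
```

Comparing against the dummy's address avoids a separate "handler active" flag, at the cost of making the dummy's identity part of the interface.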
* Re: 2.6.12-rc6-mm1
2005-06-08 14:22 ` 2.6.12-rc6-mm1 Andy Whitcroft
2005-06-08 20:01 ` 2.6.12-rc6-mm1 Andrew Morton
@ 2005-06-09 4:27 ` Andrey Panin
2005-06-09 13:12 ` 2.6.12-rc6-mm1 Andy Whitcroft
1 sibling, 1 reply; 72+ messages in thread
From: Andrey Panin @ 2005-06-09 4:27 UTC (permalink / raw)
To: Andy Whitcroft; +Cc: Andrew Morton, linux-kernel
[-- Attachment #1.1: Type: text/plain, Size: 794 bytes --]
On 159, 06 08, 2005 at 03:22:57 +0100, Andy Whitcroft wrote:
> We've been seeing an early boot hang on IBM x-series (at least on an
> x440) with -rc6-mm1. Finally got hold of a box to go search for this
> and it seems that backing out the three patches below fixes it.
>
> 515 dmi-move-acpi-boot-quirk.patch
> 516 dmi-move-acpi-sleep-quirk.patch
> 517 dmi-remove-central-blacklist.patch
>
> I am pretty sure it is actually the first one (that's where my bisection
> search pointed) but I had to drop the other two to back it out. Anyhow,
> 2.6.12-rc6-mm1 boots on an x440 with these backed out.
Yeah, probably brown paper bag time... Please try the attached patch.
--
Andrey Panin | Linux and UNIX system administrator
pazke@donpac.ru | PGP key: wwwkeys.pgp.net
[-- Attachment #1.2: patch-stupid-dmi-bug --]
[-- Type: text/plain, Size: 978 bytes --]
diff -urdpNX /usr/share/dontdiff linux-2.6.12-rc6-mm1.vanilla/arch/i386/kernel/acpi/boot.c linux-2.6.12-rc6-mm1/arch/i386/kernel/acpi/boot.c
--- linux-2.6.12-rc6-mm1.vanilla/arch/i386/kernel/acpi/boot.c 2005-06-09 08:02:06.000000000 +0400
+++ linux-2.6.12-rc6-mm1/arch/i386/kernel/acpi/boot.c 2005-06-09 08:24:01.000000000 +0400
@@ -1040,6 +1040,7 @@ static struct dmi_system_id __initdata a
 		},
 	},
 #endif
+	{ }
 };

 #endif	/* __i386__ */
diff -urdpNX /usr/share/dontdiff linux-2.6.12-rc6-mm1.vanilla/arch/i386/kernel/acpi/sleep.c linux-2.6.12-rc6-mm1/arch/i386/kernel/acpi/sleep.c
--- linux-2.6.12-rc6-mm1.vanilla/arch/i386/kernel/acpi/sleep.c 2005-06-09 08:02:06.000000000 +0400
+++ linux-2.6.12-rc6-mm1/arch/i386/kernel/acpi/sleep.c 2005-06-09 08:24:15.000000000 +0400
@@ -108,6 +108,7 @@ static __initdata struct dmi_system_id a
 			DMI_MATCH(DMI_PRODUCT_NAME, "S4030CDT/4.3"),
 		},
 	},
+	{ }
 };

 static int __init acpisleep_dmi_init(void)
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
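The fix above adds an empty `{ }` entry to the end of both quirk tables. Presumably the code that walks these tables stops at the first empty entry, so a table without one is scanned past its end. A minimal sketch of that sentinel pattern follows. The struct and function names (quirk_entry, count_matches) are hypothetical simplifications, not the kernel's dmi_system_id API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a DMI quirk table entry. */
struct quirk_entry {
	const char *ident;	/* NULL marks the end of the table */
	const char *product;
};

static const struct quirk_entry quirks[] = {
	{ "Box A", "A100" },
	{ "Box B", "B200" },
	{ NULL, NULL }		/* terminator: without it the scan runs off the array */
};

/* Walk the table until the terminator and count entries matching product. */
static int count_matches(const struct quirk_entry *table, const char *product)
{
	const struct quirk_entry *e;
	int count = 0;

	for (e = table; e->ident != NULL; e++)
		if (strcmp(e->product, product) == 0)
			count++;
	return count;
}
```

With the terminator in place, count_matches(quirks, "B200") visits exactly two entries. Without it, the loop would keep reading whatever follows the array in memory, which would be consistent with the intermittent early-boot hang reported in this thread.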
* Re: 2.6.12-rc6-mm1
2005-06-09 4:27 ` 2.6.12-rc6-mm1 Andrey Panin
@ 2005-06-09 13:12 ` Andy Whitcroft
0 siblings, 0 replies; 72+ messages in thread
From: Andy Whitcroft @ 2005-06-09 13:12 UTC (permalink / raw)
To: Andrey Panin; +Cc: Andrew Morton, linux-kernel, Martin J. Bligh
Andrey Panin wrote:
> Yeah, probably brown paper bag time... Please try the attached patch.
Ok. I can confirm that linux-2.6.12-rc6-mm1 + just this fix boots fine
and works. And yes I said works? I can't understand why backing the
others out left us with the odd spin hang and this combination doesn't.
I've managed to run 4 sets of boot and kernbench (10 runs) without a hang.
/me feels there is something else ugly in here we don't want but
unrelated to this patch.
-apw
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1 2005-06-07 11:29 2.6.12-rc6-mm1 Andrew Morton ` (4 preceding siblings ...) 2005-06-08 14:22 ` 2.6.12-rc6-mm1 Andy Whitcroft @ 2005-06-11 11:51 ` Benoit Boissinot 2005-06-18 22:39 ` 2.6.12-rc6-mm1 Richard Purdie 2005-06-21 13:20 ` 2.6.12-rc6-mm1 Dominik Karall 7 siblings, 0 replies; 72+ messages in thread From: Benoit Boissinot @ 2005-06-11 11:51 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel On 6/7/05, Andrew Morton <akpm@osdl.org> wrote: > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/ > > - Added v9fs > > - Various random fixes > > - Probably a similar number of breakages > I just had the following Oopses: Unable to handle kernel paging request at virtual address 901a1960 printing eip: c0139251 *pde = 00000000 Oops: 0002 [#1] Modules linked in: radeon drm tun snd_seq snd_pcm_oss snd_mixer_oss snd_via82xx snd_ac97_codec snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore ipt_multiport ipt_state ipt_limit ipt_MASQUERADE ipt_mark iptable_mangle ipt_MARK ipt_REJECT iptable_filter iptable_nat ip_tables ip_conntrack_irc ip_conntrack_ftp ip_conntrack skge 8139too mii usbcore ide_cd cdrom CPU: 0 EIP: 0060:[<c0139251>] Not tainted VLI EFLAGS: 00010086 (2.6.12-rc6-mm1-arakou) EIP is at find_lock_page+0x21/0xb0 eax: 901a195c ebx: 901a195c ecx: d8a3b094 edx: 00000003 esi: 00109380 edi: c18e4b08 ebp: d822cb10 esp: d822cb00 ds: 007b es: 007b ss: 0068 Process emerge (pid: 31977, threadinfo=d822c000 task=cbb9d040) Stack: c18e4b04 c1218060 00000000 00000050 d822cb34 c013930e 00000050 00109380 c18e4b04 c0333d04 00109380 c18e4a00 00001000 d822cb50 c0157986 d822cb50 00109380 00109380 00001000 c18e4a00 d822cb70 c0157af5 00001000 d822cb70 Call Trace: [<c0103d17>] show_stack+0x97/0xd0 [<c0103ec5>] show_registers+0x155/0x1f0 [<c01040c1>] die+0xc1/0x140 [<c01157ec>] do_page_fault+0x23c/0x6b5 [<c010395f>] error_code+0x4f/0x54 [<c013930e>] find_or_create_page+0x2e/0xd0 
[<c0157986>] grow_dev_page+0x26/0x110 [<c0157af5>] __getblk_slow+0x85/0x130 [<c0157e8b>] __getblk+0x3b/0x50 [<c01a788b>] search_by_key+0x9b/0xf40 [<c0195095>] reiserfs_read_locked_inode+0x65/0x110 [<c01951e9>] reiserfs_iget+0x79/0xa0 [<c0190330>] reiserfs_lookup+0xd0/0x130 [<c0161f80>] real_lookup+0xb0/0xd0 [<c01622be>] do_lookup+0x7e/0x90 [<c0162a06>] __link_path_walk+0x736/0xd50 [<c016306a>] link_path_walk+0x4a/0x110 [<c01633b4>] path_lookup+0x74/0x120 [<c01635ee>] __user_walk+0x2e/0x50 [<c015e240>] vfs_stat+0x20/0x50 [<c015e834>] sys_stat64+0x14/0x30 [<c0102e0f>] sysenter_past_esp+0x54/0x75 Code: c3 89 f6 8d bc 27 00 00 00 00 55 89 e5 57 56 89 d6 53 83 ec 04 89 45 f0 fa 8d 78 04 89 f2 89 f8 e8 35 04 0b 00 85 c0 89 c3 74 56 <ff> 40 04 0f ba 28 00 19 c0 85 c0 74 49 fb 0f ba 2b 00 19 c0 85 <1>Unable to handle kernel paging request at virtual address 71ef2710 printing eip: c0157140 *pde = 00000000 Oops: 0000 [#2] Modules linked in: radeon drm tun snd_seq snd_pcm_oss snd_mixer_oss snd_via82xx snd_ac97_codec snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore ipt_multiport ipt_state ipt_limit ipt_MASQUERADE ipt_mark iptable_mangle ipt_MARK ipt_REJECT iptable_filter iptable_nat ip_tables ip_conntrack_irc ip_conntrack_ftp ip_conntrack skge 8139too mii usbcore ide_cd cdrom CPU: 0 EIP: 0060:[<c0157140>] Not tainted VLI EFLAGS: 00010a16 (2.6.12-rc6-mm1-arakou) EIP is at __find_get_block_slow+0x90/0x140 eax: 00000000 ebx: 71ef26fc ecx: cb96f0e7 edx: 00000001 esi: c1309f20 edi: 000f9df5 ebp: e35d5b98 esp: e35d5b74 ds: 007b es: 007b ss: 0068 Process vim (pid: 32081, threadinfo=e35d5000 task=c2cc55b0) Stack: df7fb6bc f6de8a4c f7cf12fc f6de8cec 00000002 c18e4584 dcb43d7c c18e4520 000f9df5 e35d5bac c0157e1c 00001000 000f9df5 c18e4520 e35d5bc0 c0157e6c 00003e94 0000003e 000f9df5 e35d5ce0 c01a788b 0000001e 0000001f e35d5bf0 Call Trace: [<c0103d17>] show_stack+0x97/0xd0 [<c0103ec5>] show_registers+0x155/0x1f0 [<c01040c1>] die+0xc1/0x140 
[<c01157ec>] do_page_fault+0x23c/0x6b5 [<c010395f>] error_code+0x4f/0x54 [<c0157e1c>] __find_get_block+0x6c/0xa0 [<c0157e6c>] __getblk+0x1c/0x50 [<c01a788b>] search_by_key+0x9b/0xf40 [<c018fc2c>] search_by_entry_key+0x1c/0x1f0 [<c01901e0>] reiserfs_find_entry+0x90/0x110 [<c01902d2>] reiserfs_lookup+0x72/0x130 [<c0161f80>] real_lookup+0xb0/0xd0 [<c01622be>] do_lookup+0x7e/0x90 [<c0162a06>] __link_path_walk+0x736/0xd50 [<c016306a>] link_path_walk+0x4a/0x110 [<c01633b4>] path_lookup+0x74/0x120 [<c0163a09>] open_namei+0x79/0x5f0 [<c0154c29>] filp_open+0x29/0x50 [<c0154fac>] sys_open+0x3c/0xc0 [<c0102e0f>] sysenter_past_esp+0x54/0x75 Code: 89 f0 e8 34 b8 fe ff 89 d8 83 c4 18 5b 5e 5f c9 c3 8b 06 f6 c4 08 0f 84 a4 00 00 00 8b 5e 0c ba 01 00 00 00 89 d9 90 8d 74 26 00 <3b> 7b 14 74 7b 8b 03 8b 5b 04 a8 10 b8 00 00 00 00 0f 44 d0 39 Bad page state at free_hot_cold_page (in process 'firefox-bin', page c1309360) flags:0x40000000 mapping:00000000 mapcount:-1 count:0 Backtrace: [<c0103d67>] dump_stack+0x17/0x20 [<c013cb52>] bad_page+0x72/0xb0 [<c013d2da>] free_hot_cold_page+0x4a/0xe0 [<c013da81>] __pagevec_free+0x31/0x40 [<c0142a9d>] release_pages+0x9d/0x150 [<c0142b68>] __pagevec_release+0x18/0x30 [<c01430bb>] truncate_inode_pages_range+0x13b/0x300 [<c014329a>] truncate_inode_pages+0x1a/0x20 [<c016d8e2>] generic_delete_inode+0xb2/0xd0 [<c016da1f>] generic_drop_inode+0xf/0x20 [<c016da92>] iput+0x62/0x90 [<c016494f>] sys_unlink+0xdf/0x110 [<c0102e0f>] sysenter_past_esp+0x54/0x75 Trying to fix it up, but a reboot is needed regards, Benoit Boissinot ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-07 11:29 2.6.12-rc6-mm1 Andrew Morton
` (5 preceding siblings ...)
2005-06-11 11:51 ` 2.6.12-rc6-mm1 Benoit Boissinot
@ 2005-06-18 22:39 ` Richard Purdie
2005-06-18 22:44 ` 2.6.12-rc6-mm1 Andrew Morton
2005-06-18 23:18 ` 2.6.12-rc6-mm1 Russell King
2005-06-21 13:20 ` 2.6.12-rc6-mm1 Dominik Karall
7 siblings, 2 replies; 72+ messages in thread
From: Richard Purdie @ 2005-06-18 22:39 UTC (permalink / raw)
To: Russell King; +Cc: LKML, Andrew Morton
On Tue, 2005-06-07 at 04:29 -0700, Andrew Morton wrote:
> +git-arm-smp.patch
>
> ARM git trees
The arm pxa255 based Zaurus won't resume from a suspend with the patches
from the above tree applied. The suspend looks normal and gets at least
as far as pxa_pm_enter(). After that, the device appears to be dead and
needs a battery removal to reset. I'm unsure if it actually suspends and
is failing to resume or is crashing in the latter suspend stages.
Is there some documentation on what the above patch is aiming to do
anywhere?
Richard
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-18 22:39 ` 2.6.12-rc6-mm1 Richard Purdie
@ 2005-06-18 22:44 ` Andrew Morton
2005-06-18 22:57 ` 2.6.12-rc6-mm1 Richard Purdie
2005-06-18 23:18 ` 2.6.12-rc6-mm1 Russell King
1 sibling, 1 reply; 72+ messages in thread
From: Andrew Morton @ 2005-06-18 22:44 UTC (permalink / raw)
To: Richard Purdie; +Cc: linux, linux-kernel
Richard Purdie <rpurdie@rpsys.net> wrote:
>
> On Tue, 2005-06-07 at 04:29 -0700, Andrew Morton wrote:
> > +git-arm-smp.patch
> >
> > ARM git trees
>
> The arm pxa255 based Zaurus won't resume from a suspend with the patches
> from the above tree applied. The suspend looks normal and gets at least
> as far as pxa_pm_enter(). After that, the device appears to be dead and
> needs a battery removal to reset. I'm unsure if it actually suspends and
> is failing to resume or is crashing in the latter suspend stages.
>
> Is there some documentation on what the above patch is aiming to do
> anywhere?
Did you apply just that patch, or are you talking about the whole -mm lineup?
If the latter, please test with only git-arm-smp.patch.
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-18 22:44 ` 2.6.12-rc6-mm1 Andrew Morton
@ 2005-06-18 22:57 ` Richard Purdie
2005-06-18 23:11 ` 2.6.12-rc6-mm1 Richard Purdie
0 siblings, 1 reply; 72+ messages in thread
From: Richard Purdie @ 2005-06-18 22:57 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux, linux-kernel
On Sat, 2005-06-18 at 15:44 -0700, Andrew Morton wrote:
> Richard Purdie <rpurdie@rpsys.net> wrote:
> >
> > On Tue, 2005-06-07 at 04:29 -0700, Andrew Morton wrote:
> > > +git-arm-smp.patch
> > >
> > > ARM git trees
> >
> > The arm pxa255 based Zaurus won't resume from a suspend with the patches
> > from the above tree applied. The suspend looks normal and gets at least
> > as far as pxa_pm_enter(). After that, the device appears to be dead and
> > needs a battery removal to reset. I'm unsure if it actually suspends and
> > is failing to resume or is crashing in the latter suspend stages.
> >
> > Is there some documentation on what the above patch is aiming to do
> > anywhere?
>
> Did you apply just that patch, or are you talking about the whole -mm lineup?
>
> If the latter, please test with only git-arm-smp.patch.
Sorry, I wasn't clear. I had problems with the -mm lineup and tracked it
down to the above patch. With the above patch removed, -mm works fine.
(I know there's a number of changes to the arm pxa suspend/resume code
in git-arm.patch but they're definitely not causing the problem.)
Richard
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-18 22:57 ` 2.6.12-rc6-mm1 Richard Purdie
@ 2005-06-18 23:11 ` Richard Purdie
0 siblings, 0 replies; 72+ messages in thread
From: Richard Purdie @ 2005-06-18 23:11 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux, linux-kernel
On Sat, 2005-06-18 at 23:57 +0100, Richard Purdie wrote:
> On Sat, 2005-06-18 at 15:44 -0700, Andrew Morton wrote:
> > > > +git-arm-smp.patch
> > > > ARM git trees
> > >
> > > The arm pxa255 based Zaurus won't resume from a suspend with the patches
> > > from the above tree applied. The suspend looks normal and gets at least
> > > as far as pxa_pm_enter(). After that, the device appears to be dead and
> > > needs a battery removal to reset. I'm unsure if it actually suspends and
> > > is failing to resume or is crashing in the latter suspend stages.
> > >
> > > Is there some documentation on what the above patch is aiming to do
> > > anywhere?
> >
> > Did you apply just that patch, or are you talking about the whole -mm lineup?
> >
> > If the latter, please test with only git-arm-smp.patch.
>
> Sorry, I wasn't clear. I had problems with the -mm lineup and tracked it
> down to the above patch. With the above patch removed, -mm works fine.
>
> (I know there's a number of changes to the arm pxa suspend/resume code
> in git-arm.patch but they're definitely not causing the problem.)
I meant to add that git-arm-smp.patch breaks suspend/resume, even applied
in isolation against 2.6.12-rc6.
Richard
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-18 22:39 ` 2.6.12-rc6-mm1 Richard Purdie
2005-06-18 22:44 ` 2.6.12-rc6-mm1 Andrew Morton
@ 2005-06-18 23:18 ` Russell King
2005-06-19 1:20 ` 2.6.12-rc6-mm1 Richard Purdie
1 sibling, 1 reply; 72+ messages in thread
From: Russell King @ 2005-06-18 23:18 UTC (permalink / raw)
To: Richard Purdie; +Cc: LKML, Andrew Morton
On Sat, Jun 18, 2005 at 11:39:18PM +0100, Richard Purdie wrote:
> On Tue, 2005-06-07 at 04:29 -0700, Andrew Morton wrote:
> > +git-arm-smp.patch
> >
> > ARM git trees
>
> The arm pxa255 based Zaurus won't resume from a suspend with the patches
> from the above tree applied. The suspend looks normal and gets at least
> as far as pxa_pm_enter(). After that, the device appears to be dead and
> needs a battery removal to reset. I'm unsure if it actually suspends and
> is failing to resume or is crashing in the latter suspend stages.
<grumble>Well, it's a bit late for this since (a) stuff has rapidly
moved on at rmk towers since 2.6.12 was released this morning, and
(b) I've just asked Linus to pull this.</grumble>
Thinking about what's probably happening, I suspect all the ARM suspend
and resume code needs to be reworked to save more state. I'll try to
cook up a patch tomorrow to fix it, but I'll need you to provide
feedback.
Please note that you may see other ARM breakage over the next month
or so - I'm going to be concentrating on merging ARM SMP support,
and whatever bashing other people like yourself can give the kernel
will help ensure that problems are picked up quickly.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-18 23:18 ` 2.6.12-rc6-mm1 Russell King
@ 2005-06-19 1:20 ` Richard Purdie
2005-06-19 9:02 ` 2.6.12-rc6-mm1 Russell King
0 siblings, 1 reply; 72+ messages in thread
From: Richard Purdie @ 2005-06-19 1:20 UTC (permalink / raw)
To: Russell King; +Cc: LKML, Andrew Morton
On Sun, 2005-06-19 at 00:18 +0100, Russell King wrote:
> On Sat, Jun 18, 2005 at 11:39:18PM +0100, Richard Purdie wrote:
> > On Tue, 2005-06-07 at 04:29 -0700, Andrew Morton wrote:
> > > +git-arm-smp.patch
> > >
> > > ARM git trees
> >
> > The arm pxa255 based Zaurus won't resume from a suspend with the patches
> > from the above tree applied. The suspend looks normal and gets at least
> > as far as pxa_pm_enter(). After that, the device appears to be dead and
> > needs a battery removal to reset. I'm unsure if it actually suspends and
> > is failing to resume or is crashing in the latter suspend stages.
>
> <grumble>Well, it's a bit late for this since (a) stuff has rapidly
> moved on at rmk towers since 2.6.12 was released this morning, and
> (b) I've just asked Linus to pull this.</grumble>
Please don't underestimate the time it takes to wade through all the
patches in the -mm tree, find the one causing the breakage, investigate
the patch and report it to the person concerned. I'm doing the Zaurus
work in my spare time and don't get paid for it. Just reflashing and
booting a new kernel probably takes ~15mins on the Zaurus.
The copy/clearpage problem took a complete weekend to track down (as it
was showing up randomly) and then needed further evenings to debug your
patch which is a large chunk of my free time. The Checked-By: line
didn't quite give the full picture.
I realise it's taken me a while to find enough time to test/debug this
kernel but at least you now know there's a problem...
> Thinking about what's probably happening, I suspect all the ARM suspend
> and resume code needs to be reworked to save more state. I'll try to
> cook up a patch tomorrow to fix it, but I'll need you to provide
> feedback.
Ok, thanks. I'm happy to test any fixes/patches.
> Please note that you may see other ARM breakage over the next month
> or so - I'm going to be concentrating on merging ARM SMP support,
> and whatever bashing other people like yourself can give the kernel
> will help ensure that problems are picked up quickly.
In order to assist with that, can you publish these patches somewhere?
That way, I can apply them against a known good Zaurus kernel tree and
know straight away if they break anything (diff/patch format would be
preferable as my Zaurus trees are all patch based).
On a positive note, something in the later 2.6.12-rc kernels has made a
massive difference to the speed on the Zaurus - I suspect the removal of
the preempt locks on copy/clearpage. It boots up ~1.5x faster and the
speed gain will make a lot of people very happy :)
Richard
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1 2005-06-19 1:20 ` 2.6.12-rc6-mm1 Richard Purdie @ 2005-06-19 9:02 ` Russell King 2005-06-19 9:11 ` 2.6.12-rc6-mm1 Russell King 0 siblings, 1 reply; 72+ messages in thread From: Russell King @ 2005-06-19 9:02 UTC (permalink / raw) To: Richard Purdie; +Cc: LKML, Andrew Morton On Sun, Jun 19, 2005 at 02:20:48AM +0100, Richard Purdie wrote: > On Sun, 2005-06-19 at 00:18 +0100, Russell King wrote: > > Thinking about what's probably happening, I suspect all the ARM suspend > > and resume code needs to be reworked to save more state. I'll try to > > cook up a patch tomorrow to fix it, but I'll need you to provide > > feedback. > > Ok, thanks. I'm happy to test any fixes/patches. This should resolve the problem - we now rely on the stack pointer for each CPU mode to remain constant throughout the running time of the kernel, which includes across suspend/resume cycles. --- a/arch/arm/mach-pxa/sleep.S +++ b/arch/arm/mach-pxa/sleep.S @@ -38,6 +38,16 @@ ENTRY(pxa_cpu_suspend) #endif stmfd sp!, {r2 - r12, lr} @ save registers on stack + @ preserve IRQ, abort and undefined mode stack pointers + msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | IRQ_MODE + mov r4, sp + msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | ABT_MODE + mov r5, sp + msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | UND_MODE + mov r6, sp + msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | SVC_MODE + stmfd sp!, {r4 - r6} + @ get coprocessor registers mrc p14, 0, r3, c6, c0, 0 @ clock configuration, for turbo mode mrc p15, 0, r4, c15, c1, 0 @ CP access reg @@ -229,6 +239,17 @@ resume_after_mmu: #ifdef CONFIG_XSCALE_CACHE_ERRATA bl cpu_xscale_proc_init #endif + + @ restore IRQ, abort and undefined mode stack pointers + ldmfd sp!, {r4 - r6} + msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | IRQ_MODE + mov sp, r4 + msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | ABT_MODE + mov sp, r5 + msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | UND_MODE + mov sp, r6 + msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | SVC_MODE + ldmfd sp!, {r2, r3} #ifndef CONFIG_IWMMXT mar acc0, r2, r3 --- 
a/arch/arm/mach-sa1100/sleep.S +++ b/arch/arm/mach-sa1100/sleep.S @@ -37,6 +37,16 @@ ENTRY(sa1100_cpu_suspend) stmfd sp!, {r4 - r12, lr} @ save registers on stack + @ preserve IRQ, abort and undefined mode stack pointers + msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | IRQ_MODE + mov r4, sp + msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | ABT_MODE + mov r5, sp + msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | UND_MODE + mov r6, sp + msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | SVC_MODE + stmfd sp!, {r4 - r6} + @ get coprocessor registers mrc p15, 0, r4, c3, c0, 0 @ domain ID mrc p15, 0, r5, c2, c0, 0 @ translation table base addr @@ -210,6 +220,17 @@ sleep_save_sp: .text resume_after_mmu: mcr p15, 0, r1, c15, c1, 2 @ enable clock switching + + @ restore IRQ, abort and undefined mode stack pointers + ldmfd sp!, {r4 - r6} + msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | IRQ_MODE + mov sp, r4 + msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | ABT_MODE + mov sp, r5 + msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | UND_MODE + mov sp, r6 + msr cpsr_c, #PSR_F_BIT | PSR_I_BIT | SVC_MODE + ldmfd sp!, {r4 - r12, pc} @ return to caller > > Please note that you may see other ARM breakage over the next month > > or so - I'm going to be concentrating on merging ARM SMP support, > > and whatever bashing other people like yourself can give the kernel > > will help ensure that problems are picked up quickly. > > In order to assist with that, can you publish these patches somewhere? > That way, I can apply them against a known good Zaurus kernel tree and > know straight away if they break anything (diff/patch format would be > preferable as my Zaurus trees are all patch based). I'll see what I can do, but I'm going to be working fairly rapidly on merging this, so expect roughly a patch each day. Hopefully though, the later patches will only affect the Integrator platform. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: 2.6 Serial core ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-19 9:02 ` 2.6.12-rc6-mm1 Russell King
@ 2005-06-19 9:11 ` Russell King
2005-06-19 17:12 ` 2.6.12-rc6-mm1 Richard Purdie
0 siblings, 1 reply; 72+ messages in thread
From: Russell King @ 2005-06-19 9:11 UTC (permalink / raw)
To: Richard Purdie, LKML, Andrew Morton
On Sun, Jun 19, 2005 at 10:02:26AM +0100, Russell King wrote:
> On Sun, Jun 19, 2005 at 02:20:48AM +0100, Richard Purdie wrote:
> > On Sun, 2005-06-19 at 00:18 +0100, Russell King wrote:
> > > Thinking about what's probably happening, I suspect all the ARM suspend
> > > and resume code needs to be reworked to save more state. I'll try to
> > > cook up a patch tomorrow to fix it, but I'll need you to provide
> > > feedback.
> >
> > Ok, thanks. I'm happy to test any fixes/patches.
>
> This should resolve the problem - we now rely on the stack pointer for
> each CPU mode to remain constant throughout the running time of the
> kernel, which includes across suspend/resume cycles.
Actually, this patch is probably an all-round better solution.
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -328,7 +328,7 @@ static void __init setup_processor(void)
  * cpu_init dumps the cache information, initialises SMP specific
  * information, and sets up the per-CPU stacks.
  */
-void __init cpu_init(void)
+void cpu_init(void)
 {
 	unsigned int cpu = smp_processor_id();
 	struct stack *stk = &stacks[cpu];
--- a/arch/arm/mach-pxa/pm.c
+++ b/arch/arm/mach-pxa/pm.c
@@ -133,6 +133,8 @@ static int pxa_pm_enter(suspend_state_t
 	/* *** go zzz *** */
 	pxa_cpu_pm_enter(state);

+	cpu_init();
+
 	/* after sleeping, validate the checksum */
 	checksum = 0;
 	for (i = 0; i < SLEEP_SAVE_SIZE - 1; i++)
--- a/arch/arm/mach-sa1100/pm.c
+++ b/arch/arm/mach-sa1100/pm.c
@@ -88,6 +88,8 @@ static int sa11x0_pm_enter(suspend_state
 	/* go zzz */
 	sa1100_cpu_suspend();

+	cpu_init();
+
 	/*
 	 * Ensure not to come back here if it wasn't intended
 	 */
--- a/include/asm-arm/system.h
+++ b/include/asm-arm/system.h
@@ -104,6 +104,7 @@ extern void show_pte(struct mm_struct *m
 extern void __show_regs(struct pt_regs *);

 extern int cpu_architecture(void);
+extern void cpu_init(void);

 #define set_cr(x) \
 	__asm__ __volatile__( \
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core
^ permalink raw reply [flat|nested] 72+ messages in thread
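The two patches above take different routes to the same end: the first saves and restores the IRQ, abort and undefined mode stack pointers around sleep, while the second simply re-runs cpu_init() once the CPU comes back. A rough user-space model of the second approach follows. It assumes that sleep wipes the per-mode stack pointers while the stacks themselves stay in memory, and every name in it (banked_sp, model_sleep, model_cpu_init, model_pm_enter) is hypothetical.

```c
#include <stddef.h>

enum cpu_mode { MODE_IRQ, MODE_ABT, MODE_UND, MODE_COUNT };

/* Small per-mode exception stacks; they live in memory and survive sleep. */
static unsigned long mode_stacks[MODE_COUNT][3];

/* Models the per-mode (banked) stack pointer registers held in the CPU core. */
static unsigned long *banked_sp[MODE_COUNT];

/* Point every mode's stack pointer at its stack, as done at boot. */
static void model_cpu_init(void)
{
	for (int m = 0; m < MODE_COUNT; m++)
		banked_sp[m] = mode_stacks[m];
}

/* Assumption of this model: core register state is lost while asleep. */
static void model_sleep(void)
{
	for (int m = 0; m < MODE_COUNT; m++)
		banked_sp[m] = NULL;
}

/* Sleep, then redo the boot-time setup. Returns 0 if all pointers are valid. */
static int model_pm_enter(void)
{
	model_sleep();
	model_cpu_init();
	for (int m = 0; m < MODE_COUNT; m++)
		if (banked_sp[m] != mode_stacks[m])
			return -1;
	return 0;
}
```

Re-running the initialisation works here because the stack pointers are constant for the lifetime of the system, so nothing needs to be saved before sleeping. That matches Russell King's remark that each mode's stack pointer is now expected to stay constant across suspend/resume cycles.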
* Re: 2.6.12-rc6-mm1
2005-06-19 9:11 ` 2.6.12-rc6-mm1 Russell King
@ 2005-06-19 17:12 ` Richard Purdie
2005-06-19 17:39 ` 2.6.12-rc6-mm1 Russell King
0 siblings, 1 reply; 72+ messages in thread
From: Richard Purdie @ 2005-06-19 17:12 UTC (permalink / raw)
To: Russell King; +Cc: LKML, Andrew Morton
On Sun, 2005-06-19 at 10:11 +0100, Russell King wrote:
> On Sun, Jun 19, 2005 at 10:02:26AM +0100, Russell King wrote:
> > On Sun, Jun 19, 2005 at 02:20:48AM +0100, Richard Purdie wrote:
> > > On Sun, 2005-06-19 at 00:18 +0100, Russell King wrote:
> > > > Thinking about what's probably happening, I suspect all the ARM suspend
> > > > and resume code needs to be reworked to save more state. I'll try to
> > > > cook up a patch tomorrow to fix it, but I'll need you to provide
> > > > feedback.
> > >
> > > Ok, thanks. I'm happy to test any fixes/patches.
> >
> > This should resolve the problem - we now rely on the stack pointer for
> > each CPU mode to remain constant throughout the running time of the
> > kernel, which includes across suspend/resume cycles.
>
> Actually, this patch is probably an all-round better solution.
This patch (the simpler of the two using cpu_init()) allows the pxa to
suspend/resume happily with the git-arm-smp.patch applied.
Richard
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-19 17:12 ` 2.6.12-rc6-mm1 Richard Purdie
@ 2005-06-19 17:39 ` Russell King
2005-06-19 18:25 ` 2.6.12-rc6-mm1 Richard Purdie
0 siblings, 1 reply; 72+ messages in thread
From: Russell King @ 2005-06-19 17:39 UTC (permalink / raw)
To: Richard Purdie; +Cc: LKML, Andrew Morton
On Sun, Jun 19, 2005 at 06:12:38PM +0100, Richard Purdie wrote:
> This patch (the simpler of the two using cpu_init()) allows the pxa to
> suspend/resume happily with the git-arm-smp.patch applied.
Good. Fix committed.
Next batched smp patch can be found at www.home.arm.../~rmk/nightly
which I'm currently planning to go to Linus tonight.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-19 17:39 ` 2.6.12-rc6-mm1 Russell King
@ 2005-06-19 18:25 ` Richard Purdie
2005-06-19 18:56 ` 2.6.12-rc6-mm1 Russell King
0 siblings, 1 reply; 72+ messages in thread
From: Richard Purdie @ 2005-06-19 18:25 UTC (permalink / raw)
To: Russell King; +Cc: LKML, Andrew Morton
On Sun, 2005-06-19 at 18:39 +0100, Russell King wrote:
> Good. Fix committed.
Thanks.
> Next batched smp patch can be found at www.home.arm.../~rmk/nightly
> which I'm currently planning to go to Linus tonight.
I applied smp-20050619.patch to 2.6.12-rc6-mm1 + the last fix and the
Zaurus seems perfectly happy with it. Let me know as and when you have
further releases that need testing (a message to linux-arm-kernel might
be the best way to announce them?).
Richard
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-19 18:25 ` 2.6.12-rc6-mm1 Richard Purdie
@ 2005-06-19 18:56 ` Russell King
0 siblings, 0 replies; 72+ messages in thread
From: Russell King @ 2005-06-19 18:56 UTC (permalink / raw)
To: Richard Purdie; +Cc: LKML, Andrew Morton
On Sun, Jun 19, 2005 at 07:25:59PM +0100, Richard Purdie wrote:
> On Sun, 2005-06-19 at 18:39 +0100, Russell King wrote:
> > Next batched smp patch can be found at www.home.arm.../~rmk/nightly
> > which I'm currently planning to go to Linus tonight.
>
> I applied smp-20050619.patch to 2.6.12-rc6-mm1 + the last fix and the
> Zaurus seems perfectly happy with it. Let me know as and when you have
> further releases that need testing (a message to linux-arm-kernel might
> be the best way to announce them?).
Thanks for testing. Most of the other patches are platform specific
so this may not be required. However, if there are other changes to
non-platform-specific code, I'll try to point them out a couple of days
before they get merged.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1
2005-06-07 11:29 2.6.12-rc6-mm1 Andrew Morton
` (6 preceding siblings ...)
2005-06-18 22:39 ` 2.6.12-rc6-mm1 Richard Purdie
@ 2005-06-21 13:20 ` Dominik Karall
2005-06-24 21:27 ` 2.6.12-rc6-mm1 Alexey Dobriyan
2005-07-29 4:54 ` 2.6.12-rc6-mm1 Andrew Morton
7 siblings, 2 replies; 72+ messages in thread
From: Dominik Karall @ 2005-06-21 13:20 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1169 bytes --]
On Tuesday 07 June 2005 13:29, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2.6.12-rc6-mm1/
After looking in my dmesg output today, I saw the following error with
2.6.12-rc6-mm1, maybe it's useful to you. I don't know when it exactly
happens, cause I never used mono last time, I just did an emerge mono on my
gentoo system, maybe this forced the failure.
note: mono[26736] exited with preempt_count 1
scheduling while atomic: mono/0x10000001/26736
Call Trace:<ffffffff803e13ea>{schedule+122} <ffffffff8013197b>{vprintk+635}
<ffffffff803e2738>{cond_resched+56} <ffffffff80164de3>{unmap_vmas+1587}
<ffffffff8016a560>{exit_mmap+128} <ffffffff8012e7bf>{mmput+31}
<ffffffff80133466>{do_exit+438}
<ffffffff8013bf25>{__dequeue_signal+501}
<ffffffff801340c8>{do_group_exit+280}
<ffffffff8013e147>{get_signal_to_deliver+1575}
<ffffffff8010de92>{do_signal+162}
<ffffffff8012d1e0>{default_wake_function+0}
<ffffffff8010e8e1>{sys_rt_sigreturn+577}
<ffffffff8010eb3f>{sysret_signal+28}
<ffffffff8010ee27>{ptregscall_common+103}
cheers,
dominik
[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]
^ permalink raw reply [flat|nested] 72+ messages in thread
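For readers trying to interpret the 0x10000001 in the "scheduling while atomic" line of the report above, here is a minimal sketch. It assumes the preempt_count bit layout that I believe 2.6-era kernels documented in include/linux/hardirq.h (preempt depth in bits 0-7, softirq count in bits 8-15, hardirq count in bits 16-27, PREEMPT_ACTIVE in bit 28). Nothing in this thread confirms that layout, architectures could override the hardirq width, and the helper name decode_preempt_count is purely illustrative.

```python
# Assumed 2.6-era preempt_count layout (an assumption, not stated in this thread).
PREEMPT_MASK = 0x000000FF    # bits 0-7: preemption depth
SOFTIRQ_MASK = 0x0000FF00    # bits 8-15: softirq count
HARDIRQ_MASK = 0x0FFF0000    # bits 16-27: hardirq count (default width)
PREEMPT_ACTIVE = 0x10000000  # bit 28: PREEMPT_ACTIVE flag


def decode_preempt_count(value):
    """Split a raw preempt_count value into its assumed fields."""
    return {
        "preempt_depth": value & PREEMPT_MASK,
        "softirq_count": (value & SOFTIRQ_MASK) >> 8,
        "hardirq_count": (value & HARDIRQ_MASK) >> 16,
        "preempt_active": bool(value & PREEMPT_ACTIVE),
    }


if __name__ == "__main__":
    # Value taken from "scheduling while atomic: mono/0x10000001/26736".
    print(decode_preempt_count(0x10000001))
```

Under that assumed layout the value decodes to a preemption depth of 1 with no softirq or hardirq nesting, which would match the "exited with preempt_count 1" note, i.e. the task apparently exited while still holding one preempt-disable reference.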
* Re: 2.6.12-rc6-mm1 2005-06-21 13:20 ` 2.6.12-rc6-mm1 Dominik Karall @ 2005-06-24 21:27 ` Alexey Dobriyan 2005-07-29 4:54 ` 2.6.12-rc6-mm1 Andrew Morton 1 sibling, 0 replies; 72+ messages in thread From: Alexey Dobriyan @ 2005-06-24 21:27 UTC (permalink / raw) To: Dominik Karall; +Cc: Andrew Morton, linux-kernel On Tuesday 21 June 2005 17:20, Dominik Karall wrote: > After looking in my dmesg output today, I saw following error with > 2.6.12-rc6-mm1, maybe it's usefull to you. I don't know when it exactly > happens, cause I never used mono last time, I just did an emerge mono on my > gentoo system, maybe this forced the failure. > > note: mono[26736] exited with preempt_count 1 > scheduling while atomic: mono/0x10000001/26736 I've filed a bug at kernel bugzilla, so your report won't be lost. See http://bugme.osdl.org/show_bug.cgi?id=4794 You can register at http://bugme.osdl.org/createaccount.cgi and add yourself to CC list. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1 2005-06-21 13:20 ` 2.6.12-rc6-mm1 Dominik Karall 2005-06-24 21:27 ` 2.6.12-rc6-mm1 Alexey Dobriyan @ 2005-07-29 4:54 ` Andrew Morton 2005-07-29 13:39 ` 2.6.12-rc6-mm1 Dominik Karall 1 sibling, 1 reply; 72+ messages in thread From: Andrew Morton @ 2005-07-29 4:54 UTC (permalink / raw) To: Dominik Karall; +Cc: linux-kernel Dominik Karall <dominik.karall@gmx.net> wrote: > > On Tuesday 07 June 2005 13:29, Andrew Morton wrote: > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc6/2. > >6.12-rc6-mm1/ > > After looking in my dmesg output today, I saw following error with > 2.6.12-rc6-mm1, maybe it's usefull to you. I don't know when it exactly > happens, cause I never used mono last time, I just did an emerge mono on my > gentoo system, maybe this forced the failure. > > note: mono[26736] exited with preempt_count 1 > scheduling while atomic: mono/0x10000001/26736 > > Call Trace:<ffffffff803e13ea>{schedule+122} <ffffffff8013197b>{vprintk+635} > <ffffffff803e2738>{cond_resched+56} <ffffffff80164de3>{unmap_vmas+1587} > <ffffffff8016a560>{exit_mmap+128} <ffffffff8012e7bf>{mmput+31} > <ffffffff80133466>{do_exit+438} > <ffffffff8013bf25>{__dequeue_signal+501} > <ffffffff801340c8>{do_group_exit+280} > <ffffffff8013e147>{get_signal_to_deliver+1575} > <ffffffff8010de92>{do_signal+162} > <ffffffff8012d1e0>{default_wake_function+0} > <ffffffff8010e8e1>{sys_rt_sigreturn+577} > <ffffffff8010eb3f>{sysret_signal+28} > <ffffffff8010ee27>{ptregscall_common+103} > A couple of people reported this, but all seems to have gone quiet. Is it fixed in later -mm's? Is 2.6.13-rc4 running OK? Thanks. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1 2005-07-29 4:54 ` 2.6.12-rc6-mm1 Andrew Morton @ 2005-07-29 13:39 ` Dominik Karall 2005-07-29 18:22 ` 2.6.12-rc6-mm1 Andrew Morton 0 siblings, 1 reply; 72+ messages in thread From: Dominik Karall @ 2005-07-29 13:39 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel [-- Attachment #1: Type: text/plain, Size: 2179 bytes --] On Friday 29 July 2005 06:54, Andrew Morton wrote: > Dominik Karall <dominik.karall@gmx.net> wrote: > > On Tuesday 07 June 2005 13:29, Andrew Morton wrote: > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc > > >6/2. 6.12-rc6-mm1/ > > > > After looking in my dmesg output today, I saw following error with > > 2.6.12-rc6-mm1, maybe it's usefull to you. I don't know when it exactly > > happens, cause I never used mono last time, I just did an emerge mono on > > my gentoo system, maybe this forced the failure. > > > > note: mono[26736] exited with preempt_count 1 > > scheduling while atomic: mono/0x10000001/26736 > > > > Call Trace:<ffffffff803e13ea>{schedule+122} > > <ffffffff8013197b>{vprintk+635} <ffffffff803e2738>{cond_resched+56} > > <ffffffff80164de3>{unmap_vmas+1587} <ffffffff8016a560>{exit_mmap+128} > > <ffffffff8012e7bf>{mmput+31} <ffffffff80133466>{do_exit+438} > > <ffffffff8013bf25>{__dequeue_signal+501} > > <ffffffff801340c8>{do_group_exit+280} > > <ffffffff8013e147>{get_signal_to_deliver+1575} > > <ffffffff8010de92>{do_signal+162} > > <ffffffff8012d1e0>{default_wake_function+0} > > <ffffffff8010e8e1>{sys_rt_sigreturn+577} > > <ffffffff8010eb3f>{sysret_signal+28} > > <ffffffff8010ee27>{ptregscall_common+103} > > A couple of people reported this, but all seems to have gone quiet. Is it > fixed in later -mm's? Is 2.6.13-rc4 running OK? > > Thanks. hi andrew! I'm sorry, but it's not fixed in current 2.6.13-rc3-mm3. 
I did an emerge mono right now to test it, and I got this one: Jul 29 15:26:37 [kernel] note: mono[11138] exited with preempt_count 1 Jul 29 15:26:50 [kernel] file[14627]: segfault at 00002aaaab453000 rip 00002aaaaaf652cf rsp 00007fffffe43b50 error 4 Jul 29 15:26:50 [kernel] file[14633]: segfault at 00002aaaab453000 rip 00002aaaaaf652cf rsp 00007fffffcc87a0 error 4 Jul 29 15:26:51 [kernel] file[14669]: segfault at 00002aaaab453000 rip 00002aaaaaf652cf rsp 00007fffff905f80 error 4 DEBUG_KERNEL/ PREEMPT/ SPINLOCK are enabled, but I didn't get more info about the bug. Did I forget any debug option? greets, dominik [-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --] ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1 2005-07-29 13:39 ` 2.6.12-rc6-mm1 Dominik Karall @ 2005-07-29 18:22 ` Andrew Morton 2005-07-29 21:19 ` 2.6.12-rc6-mm1 Dominik Karall 0 siblings, 1 reply; 72+ messages in thread From: Andrew Morton @ 2005-07-29 18:22 UTC (permalink / raw) To: Dominik Karall; +Cc: linux-kernel Dominik Karall <dominik.karall@gmx.net> wrote: > > On Friday 29 July 2005 06:54, Andrew Morton wrote: > > Dominik Karall <dominik.karall@gmx.net> wrote: > > > On Tuesday 07 June 2005 13:29, Andrew Morton wrote: > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc > > > >6/2. 6.12-rc6-mm1/ > > > > > > After looking in my dmesg output today, I saw following error with > > > 2.6.12-rc6-mm1, maybe it's usefull to you. I don't know when it exactly > > > happens, cause I never used mono last time, I just did an emerge mono on > > > my gentoo system, maybe this forced the failure. > > > > > > note: mono[26736] exited with preempt_count 1 > > > scheduling while atomic: mono/0x10000001/26736 > > > > > > Call Trace:<ffffffff803e13ea>{schedule+122} > > > <ffffffff8013197b>{vprintk+635} <ffffffff803e2738>{cond_resched+56} > > > <ffffffff80164de3>{unmap_vmas+1587} <ffffffff8016a560>{exit_mmap+128} > > > <ffffffff8012e7bf>{mmput+31} <ffffffff80133466>{do_exit+438} > > > <ffffffff8013bf25>{__dequeue_signal+501} > > > <ffffffff801340c8>{do_group_exit+280} > > > <ffffffff8013e147>{get_signal_to_deliver+1575} > > > <ffffffff8010de92>{do_signal+162} > > > <ffffffff8012d1e0>{default_wake_function+0} > > > <ffffffff8010e8e1>{sys_rt_sigreturn+577} > > > <ffffffff8010eb3f>{sysret_signal+28} > > > <ffffffff8010ee27>{ptregscall_common+103} > > > > A couple of people reported this, but all seems to have gone quiet. Is it > > fixed in later -mm's? Is 2.6.13-rc4 running OK? > > > > Thanks. > > hi andrew! > > I'm sorry, but it's not fixed in current 2.6.13-rc3-mm3. 
I did an emerge mono > right now to test it, and I got this one: > Jul 29 15:26:37 [kernel] note: mono[11138] exited with preempt_count 1 > Jul 29 15:26:50 [kernel] file[14627]: segfault at 00002aaaab453000 rip > 00002aaaaaf652cf rsp 00007fffffe43b50 error 4 > Jul 29 15:26:50 [kernel] file[14633]: segfault at 00002aaaab453000 rip > 00002aaaaaf652cf rsp 00007fffffcc87a0 error 4 > Jul 29 15:26:51 [kernel] file[14669]: segfault at 00002aaaab453000 rip > 00002aaaaaf652cf rsp 00007fffff905f80 error 4 > > DEBUG_KERNEL/ PREEMPT/ SPINLOCK are enabled, but I didn't get more info about > the bug. Did I forget any debug option? Gee, I don't know how to find this one. Do you know if the problem is specific to -mm? ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1 2005-07-29 18:22 ` 2.6.12-rc6-mm1 Andrew Morton @ 2005-07-29 21:19 ` Dominik Karall 2005-07-29 21:27 ` 2.6.12-rc6-mm1 Andrew Morton 0 siblings, 1 reply; 72+ messages in thread From: Dominik Karall @ 2005-07-29 21:19 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel [-- Attachment #1: Type: text/plain, Size: 2706 bytes --] On Friday 29 July 2005 20:22, Andrew Morton wrote: > Dominik Karall <dominik.karall@gmx.net> wrote: > > On Friday 29 July 2005 06:54, Andrew Morton wrote: > > > Dominik Karall <dominik.karall@gmx.net> wrote: > > > > On Tuesday 07 June 2005 13:29, Andrew Morton wrote: > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.1 > > > > >2-rc 6/2. 6.12-rc6-mm1/ > > > > > > > > After looking in my dmesg output today, I saw following error with > > > > 2.6.12-rc6-mm1, maybe it's usefull to you. I don't know when it > > > > exactly happens, cause I never used mono last time, I just did an > > > > emerge mono on my gentoo system, maybe this forced the failure. > > > > > > > > note: mono[26736] exited with preempt_count 1 > > > > scheduling while atomic: mono/0x10000001/26736 > > > > > > > > Call Trace:<ffffffff803e13ea>{schedule+122} > > > > <ffffffff8013197b>{vprintk+635} <ffffffff803e2738>{cond_resched+56} > > > > <ffffffff80164de3>{unmap_vmas+1587} <ffffffff8016a560>{exit_mmap+128} > > > > <ffffffff8012e7bf>{mmput+31} <ffffffff80133466>{do_exit+438} > > > > <ffffffff8013bf25>{__dequeue_signal+501} > > > > <ffffffff801340c8>{do_group_exit+280} > > > > <ffffffff8013e147>{get_signal_to_deliver+1575} > > > > <ffffffff8010de92>{do_signal+162} > > > > <ffffffff8012d1e0>{default_wake_function+0} > > > > <ffffffff8010e8e1>{sys_rt_sigreturn+577} > > > > <ffffffff8010eb3f>{sysret_signal+28} > > > > <ffffffff8010ee27>{ptregscall_common+103} > > > > > > A couple of people reported this, but all seems to have gone quiet. Is > > > it fixed in later -mm's? Is 2.6.13-rc4 running OK? > > > > > > Thanks. 
> > > > hi andrew! > > > > I'm sorry, but it's not fixed in current 2.6.13-rc3-mm3. I did an emerge > > mono right now to test it, and I got this one: > > Jul 29 15:26:37 [kernel] note: mono[11138] exited with preempt_count 1 > > Jul 29 15:26:50 [kernel] file[14627]: segfault at 00002aaaab453000 rip > > 00002aaaaaf652cf rsp 00007fffffe43b50 error 4 > > Jul 29 15:26:50 [kernel] file[14633]: segfault at 00002aaaab453000 rip > > 00002aaaaaf652cf rsp 00007fffffcc87a0 error 4 > > Jul 29 15:26:51 [kernel] file[14669]: segfault at 00002aaaab453000 rip > > 00002aaaaaf652cf rsp 00007fffff905f80 error 4 > > > > DEBUG_KERNEL/ PREEMPT/ SPINLOCK are enabled, but I didn't get more info > > about the bug. Did I forget any debug option? > > Gee, I don't know how to find this one. Do you know if the problem is > specific to -mm? Tested with 2.6.13-rc4 and it seems to work. Didn't get any error. So it seems to be -mm related. Do you suspect any patch which could cause the error? dominik [-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --] ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1 2005-07-29 21:19 ` 2.6.12-rc6-mm1 Dominik Karall @ 2005-07-29 21:27 ` Andrew Morton 2005-07-29 21:37 ` 2.6.12-rc6-mm1 Dominik Karall 0 siblings, 1 reply; 72+ messages in thread From: Andrew Morton @ 2005-07-29 21:27 UTC (permalink / raw) To: Dominik Karall; +Cc: linux-kernel Dominik Karall <dominik.karall@gmx.net> wrote: > > On Friday 29 July 2005 20:22, Andrew Morton wrote: > > Dominik Karall <dominik.karall@gmx.net> wrote: > > > On Friday 29 July 2005 06:54, Andrew Morton wrote: > > > > Dominik Karall <dominik.karall@gmx.net> wrote: > > > > > On Tuesday 07 June 2005 13:29, Andrew Morton wrote: > > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.1 > > > > > >2-rc 6/2. 6.12-rc6-mm1/ > > > > > > > > > > After looking in my dmesg output today, I saw following error with > > > > > 2.6.12-rc6-mm1, maybe it's usefull to you. I don't know when it > > > > > exactly happens, cause I never used mono last time, I just did an > > > > > emerge mono on my gentoo system, maybe this forced the failure. > > > > > > > > > > note: mono[26736] exited with preempt_count 1 > > > > > scheduling while atomic: mono/0x10000001/26736 > > > > > > > > > > Call Trace:<ffffffff803e13ea>{schedule+122} > > > > > <ffffffff8013197b>{vprintk+635} <ffffffff803e2738>{cond_resched+56} > > > > > <ffffffff80164de3>{unmap_vmas+1587} <ffffffff8016a560>{exit_mmap+128} > > > > > <ffffffff8012e7bf>{mmput+31} <ffffffff80133466>{do_exit+438} > > > > > <ffffffff8013bf25>{__dequeue_signal+501} > > > > > <ffffffff801340c8>{do_group_exit+280} > > > > > <ffffffff8013e147>{get_signal_to_deliver+1575} > > > > > <ffffffff8010de92>{do_signal+162} > > > > > <ffffffff8012d1e0>{default_wake_function+0} > > > > > <ffffffff8010e8e1>{sys_rt_sigreturn+577} > > > > > <ffffffff8010eb3f>{sysret_signal+28} > > > > > <ffffffff8010ee27>{ptregscall_common+103} > > > > > > > > A couple of people reported this, but all seems to have gone quiet. Is > > > > it fixed in later -mm's? 
Is 2.6.13-rc4 running OK? > > > > > > > > Thanks. > > > > > > hi andrew! > > > > > > I'm sorry, but it's not fixed in current 2.6.13-rc3-mm3. I did an emerge > > > mono right now to test it, and I got this one: > > > Jul 29 15:26:37 [kernel] note: mono[11138] exited with preempt_count 1 > > > Jul 29 15:26:50 [kernel] file[14627]: segfault at 00002aaaab453000 rip > > > 00002aaaaaf652cf rsp 00007fffffe43b50 error 4 > > > Jul 29 15:26:50 [kernel] file[14633]: segfault at 00002aaaab453000 rip > > > 00002aaaaaf652cf rsp 00007fffffcc87a0 error 4 > > > Jul 29 15:26:51 [kernel] file[14669]: segfault at 00002aaaab453000 rip > > > 00002aaaaaf652cf rsp 00007fffff905f80 error 4 > > > > > > DEBUG_KERNEL/ PREEMPT/ SPINLOCK are enabled, but I didn't get more info > > > about the bug. Did I forget any debug option? > > > > Gee, I don't know how to find this one. Do you know if the problem is > > specific to -mm? > > Tested with 2.6.13-rc4 and it seems to work. Didn't get any error. Great, thanks for that. > So it seems to be -mm related. Do you suspect any patch which could cause the > error? I wouldn't know, sorry. Possibly the scheduler patches, possibly an x86_64-specific patch. Is the problem repeatable? If so, a binary search would only take ten build-n-boots ;) ^ permalink raw reply [flat|nested] 72+ messages in thread
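The "ten build-n-boots" estimate in the message above is what a binary search over an ordered patch series gives when the series holds on the order of a thousand patches, since ceil(log2(1000)) = 10. The sketch below illustrates that arithmetic only. The figure of 1000 patches, the function name find_first_bad_patch and the reproduces_with callback are assumptions made for illustration; the thread does not state the size of the -mm series, and a real search means applying the first n patches, building, booting and running the reproducer by hand. It also assumes a single offending patch, a clean result with no patches applied, and a failure that persists once the offending patch is in.

```python
import math


def find_first_bad_patch(num_patches, reproduces_with):
    """Bisect an ordered patch series.

    reproduces_with(n) must return True if the bug shows up with the first
    n patches applied. Returns (1-based index of the first bad patch,
    number of build-and-boot tests performed).
    """
    good, bad = 0, num_patches  # first `good` patches are fine, first `bad` fail
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1
        if reproduces_with(mid):
            bad = mid   # culprit is at or before mid
        else:
            good = mid  # culprit is after mid
    return bad, tests


if __name__ == "__main__":
    # Hypothetical series of 1000 patches with the culprit at position 637.
    culprit, tests = find_first_bad_patch(1000, lambda n: n >= 637)
    print(culprit, tests, math.ceil(math.log2(1000)))
```

Each test halves the remaining range, so the number of build-and-boot cycles never exceeds ceil(log2(num_patches)), whichever patch turns out to be at fault.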
* Re: 2.6.12-rc6-mm1 2005-07-29 21:27 ` 2.6.12-rc6-mm1 Andrew Morton @ 2005-07-29 21:37 ` Dominik Karall 2005-08-04 19:44 ` 2.6.12-rc6-mm1 Andrew Morton 0 siblings, 1 reply; 72+ messages in thread From: Dominik Karall @ 2005-07-29 21:37 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel [-- Attachment #1: Type: text/plain, Size: 3597 bytes --] On Friday 29 July 2005 23:27, Andrew Morton wrote: > Dominik Karall <dominik.karall@gmx.net> wrote: > > On Friday 29 July 2005 20:22, Andrew Morton wrote: > > > Dominik Karall <dominik.karall@gmx.net> wrote: > > > > On Friday 29 July 2005 06:54, Andrew Morton wrote: > > > > > Dominik Karall <dominik.karall@gmx.net> wrote: > > > > > > On Tuesday 07 June 2005 13:29, Andrew Morton wrote: > > > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2 > > > > > > >.6.1 2-rc 6/2. 6.12-rc6-mm1/ > > > > > > > > > > > > After looking in my dmesg output today, I saw following error > > > > > > with 2.6.12-rc6-mm1, maybe it's usefull to you. I don't know when > > > > > > it exactly happens, cause I never used mono last time, I just did > > > > > > an emerge mono on my gentoo system, maybe this forced the > > > > > > failure. 
> > > > > > > > > > > > note: mono[26736] exited with preempt_count 1 > > > > > > scheduling while atomic: mono/0x10000001/26736 > > > > > > > > > > > > Call Trace:<ffffffff803e13ea>{schedule+122} > > > > > > <ffffffff8013197b>{vprintk+635} > > > > > > <ffffffff803e2738>{cond_resched+56} > > > > > > <ffffffff80164de3>{unmap_vmas+1587} > > > > > > <ffffffff8016a560>{exit_mmap+128} <ffffffff8012e7bf>{mmput+31} > > > > > > <ffffffff80133466>{do_exit+438} > > > > > > <ffffffff8013bf25>{__dequeue_signal+501} > > > > > > <ffffffff801340c8>{do_group_exit+280} > > > > > > <ffffffff8013e147>{get_signal_to_deliver+1575} > > > > > > <ffffffff8010de92>{do_signal+162} > > > > > > <ffffffff8012d1e0>{default_wake_function+0} > > > > > > <ffffffff8010e8e1>{sys_rt_sigreturn+577} > > > > > > <ffffffff8010eb3f>{sysret_signal+28} > > > > > > <ffffffff8010ee27>{ptregscall_common+103} > > > > > > > > > > A couple of people reported this, but all seems to have gone quiet. > > > > > Is it fixed in later -mm's? Is 2.6.13-rc4 running OK? > > > > > > > > > > Thanks. > > > > > > > > hi andrew! > > > > > > > > I'm sorry, but it's not fixed in current 2.6.13-rc3-mm3. I did an > > > > emerge mono right now to test it, and I got this one: > > > > Jul 29 15:26:37 [kernel] note: mono[11138] exited with preempt_count > > > > 1 Jul 29 15:26:50 [kernel] file[14627]: segfault at 00002aaaab453000 > > > > rip 00002aaaaaf652cf rsp 00007fffffe43b50 error 4 > > > > Jul 29 15:26:50 [kernel] file[14633]: segfault at 00002aaaab453000 > > > > rip 00002aaaaaf652cf rsp 00007fffffcc87a0 error 4 > > > > Jul 29 15:26:51 [kernel] file[14669]: segfault at 00002aaaab453000 > > > > rip 00002aaaaaf652cf rsp 00007fffff905f80 error 4 > > > > > > > > DEBUG_KERNEL/ PREEMPT/ SPINLOCK are enabled, but I didn't get more > > > > info about the bug. Did I forget any debug option? > > > > > > Gee, I don't know how to find this one. Do you know if the problem is > > > specific to -mm? 
> > > > Tested with 2.6.13-rc4 and it seems to work. Didn't get any error. > > Great, thanks for that. > > > So it seems to be -mm related. Do you suspect any patch which could cause > > the error? > > I wouldn't know, sorry. Possibly the scheduler patches, possibly an > x86_64-specific patch. Is the problem repeatable? If so, a binary search > would only take ten build-n-boots ;) Yes, it is repeatable. I tested on the latest -mm about 4 times. Ok, I will try to find the right patch tomorrow, 10 build-n-boots would end up in morning ;) btw, as the error occurred in 2.6.12-rc6-mm1 too, it must be an old patch which wasn't merged to Linus' tree till now...hope there aren't a lot of them :) [-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --] ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1 2005-07-29 21:37 ` 2.6.12-rc6-mm1 Dominik Karall @ 2005-08-04 19:44 ` Andrew Morton 2005-08-04 22:28 ` 2.6.12-rc6-mm1 Andrew Morton 0 siblings, 1 reply; 72+ messages in thread From: Andrew Morton @ 2005-08-04 19:44 UTC (permalink / raw) To: Dominik Karall; +Cc: linux-kernel Dominik Karall <dominik.karall@gmx.net> wrote: > > On Friday 29 July 2005 23:27, Andrew Morton wrote: > > Dominik Karall <dominik.karall@gmx.net> wrote: > > > On Friday 29 July 2005 20:22, Andrew Morton wrote: > > > > Dominik Karall <dominik.karall@gmx.net> wrote: > > > > > On Friday 29 July 2005 06:54, Andrew Morton wrote: > > > > > > Dominik Karall <dominik.karall@gmx.net> wrote: > > > > > > > On Tuesday 07 June 2005 13:29, Andrew Morton wrote: > > > > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2 > > > > > > > >.6.1 2-rc 6/2. 6.12-rc6-mm1/ > > > > > > > > > > > > > > After looking in my dmesg output today, I saw following error > > > > > > > with 2.6.12-rc6-mm1, maybe it's usefull to you. I don't know when > > > > > > > it exactly happens, cause I never used mono last time, I just did > > > > > > > an emerge mono on my gentoo system, maybe this forced the > > > > > > > failure. 
> > > > > > > > > > > > > > note: mono[26736] exited with preempt_count 1 > > > > > > > scheduling while atomic: mono/0x10000001/26736 > > > > > > > > > > > > > > Call Trace:<ffffffff803e13ea>{schedule+122} > > > > > > > <ffffffff8013197b>{vprintk+635} > > > > > > > <ffffffff803e2738>{cond_resched+56} > > > > > > > <ffffffff80164de3>{unmap_vmas+1587} > > > > > > > <ffffffff8016a560>{exit_mmap+128} <ffffffff8012e7bf>{mmput+31} > > > > > > > <ffffffff80133466>{do_exit+438} > > > > > > > <ffffffff8013bf25>{__dequeue_signal+501} > > > > > > > <ffffffff801340c8>{do_group_exit+280} > > > > > > > <ffffffff8013e147>{get_signal_to_deliver+1575} > > > > > > > <ffffffff8010de92>{do_signal+162} > > > > > > > <ffffffff8012d1e0>{default_wake_function+0} > > > > > > > <ffffffff8010e8e1>{sys_rt_sigreturn+577} > > > > > > > <ffffffff8010eb3f>{sysret_signal+28} > > > > > > > <ffffffff8010ee27>{ptregscall_common+103} > > > > > > > > > > > > A couple of people reported this, but all seems to have gone quiet. > > > > > > Is it fixed in later -mm's? Is 2.6.13-rc4 running OK? > > > > > > > > > > > > Thanks. > > > > > > > > > > hi andrew! > > > > > > > > > > I'm sorry, but it's not fixed in current 2.6.13-rc3-mm3. I did an > > > > > emerge mono right now to test it, and I got this one: > > > > > Jul 29 15:26:37 [kernel] note: mono[11138] exited with preempt_count > > > > > 1 Jul 29 15:26:50 [kernel] file[14627]: segfault at 00002aaaab453000 > > > > > rip 00002aaaaaf652cf rsp 00007fffffe43b50 error 4 > > > > > Jul 29 15:26:50 [kernel] file[14633]: segfault at 00002aaaab453000 > > > > > rip 00002aaaaaf652cf rsp 00007fffffcc87a0 error 4 > > > > > Jul 29 15:26:51 [kernel] file[14669]: segfault at 00002aaaab453000 > > > > > rip 00002aaaaaf652cf rsp 00007fffff905f80 error 4 > > > > > > > > > > DEBUG_KERNEL/ PREEMPT/ SPINLOCK are enabled, but I didn't get more > > > > > info about the bug. Did I forget any debug option? > > > > > > > > Gee, I don't know how to find this one. 
Do you know if the problem is > > > > specific to -mm? > > > > > > Tested with 2.6.13-rc4 and it seems to work. Didn't get any error. > > > > Great, thanks for that. > > > > > So it seems to be -mm related. Do you suspect any patch which could cause > > > the error? > > > > I wouldn't know, sorry. Possibly the scheduler patches, possibly an > > x86_64-specific patch. Is the problem repeatable? If so, a binary search > > would only take ten build-n-boots ;) > > Yes, it is repeatable. I tested on the latest -mm about 4 times. Ok, I will try > to find the right patch tomorrow, 10 build-n-boots would end up in morning ;) > > btw, as the error occurred in 2.6.12-rc6-mm1 too, it must be an old patch which > wasn't merged to Linus' tree till now...hope there aren't a lot of them :) > Any progress on this? It kinda means that the whole of the -mm lineup is stuck until we can identify the offending patch. We have a couple of weeks in which to do this but if you can identify the bad patch it'd help enormously, thanks. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1 2005-08-04 19:44 ` 2.6.12-rc6-mm1 Andrew Morton @ 2005-08-04 22:28 ` Andrew Morton 2005-08-04 22:44 ` 2.6.12-rc6-mm1 Dominik Karall 0 siblings, 1 reply; 72+ messages in thread From: Andrew Morton @ 2005-08-04 22:28 UTC (permalink / raw) To: dominik.karall, linux-kernel; +Cc: Ingo Molnar Andrew Morton <akpm@osdl.org> wrote: > > Dominik Karall <dominik.karall@gmx.net> wrote: > > > > On Friday 29 July 2005 23:27, Andrew Morton wrote: > > > Dominik Karall <dominik.karall@gmx.net> wrote: > > > > On Friday 29 July 2005 20:22, Andrew Morton wrote: > > > > > Dominik Karall <dominik.karall@gmx.net> wrote: > > > > > > On Friday 29 July 2005 06:54, Andrew Morton wrote: > > > > > > > Dominik Karall <dominik.karall@gmx.net> wrote: > > > > > > > > On Tuesday 07 June 2005 13:29, Andrew Morton wrote: > > > > > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2 > > > > > > > > >.6.1 2-rc 6/2. 6.12-rc6-mm1/ > > > > > > > > > > > > > > > > After looking in my dmesg output today, I saw following error > > > > > > > > with 2.6.12-rc6-mm1, maybe it's usefull to you. I don't know when > > > > > > > > it exactly happens, cause I never used mono last time, I just did > > > > > > > > an emerge mono on my gentoo system, maybe this forced the > > > > > > > > failure. 
> > > > > > > > > > > > > > > > note: mono[26736] exited with preempt_count 1 > > > > > > > > scheduling while atomic: mono/0x10000001/26736 > > > > > > > > > > > > > > > > Call Trace:<ffffffff803e13ea>{schedule+122} > > > > > > > > <ffffffff8013197b>{vprintk+635} > > > > > > > > <ffffffff803e2738>{cond_resched+56} > > > > > > > > <ffffffff80164de3>{unmap_vmas+1587} > > > > > > > > <ffffffff8016a560>{exit_mmap+128} <ffffffff8012e7bf>{mmput+31} > > > > > > > > <ffffffff80133466>{do_exit+438} > > > > > > > > <ffffffff8013bf25>{__dequeue_signal+501} > > > > > > > > <ffffffff801340c8>{do_group_exit+280} > > > > > > > > <ffffffff8013e147>{get_signal_to_deliver+1575} > > > > > > > > <ffffffff8010de92>{do_signal+162} > > > > > > > > <ffffffff8012d1e0>{default_wake_function+0} > > > > > > > > <ffffffff8010e8e1>{sys_rt_sigreturn+577} > > > > > > > > <ffffffff8010eb3f>{sysret_signal+28} > > > > > > > > <ffffffff8010ee27>{ptregscall_common+103} > > > > > > > > > > > > > > A couple of people reported this, but all seems to have gone quiet. > > > > > > > Is it fixed in later -mm's? Is 2.6.13-rc4 running OK? > > > > > > > > > > > > > > Thanks. > > > > > > > > > > > > hi andrew! > > > > > > > > > > > > I'm sorry, but it's not fixed in current 2.6.13-rc3-mm3. I did an > > > > > > emerge mono right now to test it, and I got this one: > > > > > > Jul 29 15:26:37 [kernel] note: mono[11138] exited with preempt_count > > > > > > 1 Jul 29 15:26:50 [kernel] file[14627]: segfault at 00002aaaab453000 > > > > > > rip 00002aaaaaf652cf rsp 00007fffffe43b50 error 4 > > > > > > Jul 29 15:26:50 [kernel] file[14633]: segfault at 00002aaaab453000 > > > > > > rip 00002aaaaaf652cf rsp 00007fffffcc87a0 error 4 > > > > > > Jul 29 15:26:51 [kernel] file[14669]: segfault at 00002aaaab453000 > > > > > > rip 00002aaaaaf652cf rsp 00007fffff905f80 error 4 > > > > > > > > > > > > DEBUG_KERNEL/ PREEMPT/ SPINLOCK are enabled, but I didn't get more > > > > > > info about the bug. Did I forget any debug option? 
> > > > > > > > > > Gee, I don't know how to find this one. Do you know if the problem is > > > > > specific to -mm? > > > > > > > > Tested with 2.6.13-rc4 and it seems to work. Didn't get any error. > > > > > > Great, thanks for that. > > > > > > > So it seems to be -mm related. Do you suspect any patch which could cause > > > > the error? > > > > > > I wouldn't know, sorry. Possible the scheduler patches, possibly an > > > x86_64-specific patch. Is the problem repeatable? If so, a binary search > > > would only take ten build-n-boots ;) > > > > Yes, it is repeatable. I tested on lastest -mm about 4 times. Ok, I will try > > to find the right patch tomorrow, 10 build-n-boots would end up in morning ;) > > > > btw, as the error occured in 2.6.12-rc6-mm1 too, it must be an old patch which > > wasn't merged to linus tree till now...hope there aren't a lot of them :) > > > > Any progress on this? It kinda measn that the whole of the -mm lineup is > stuck until we can identify the offending patch. We have a couple of weeks > in which to do this but if you can identify the bad patch it'd help > enormously, thanks. > OK, Bartosz Taudul tells me that he's occasionally seeing this on stock 2.6.12 (thanks!). So there's not a lot of point in doing the -mm bisection search. I think Ingo was planning on coming up with some infrastructure which would allow us to debug this further. ^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: 2.6.12-rc6-mm1 2005-08-04 22:28 ` 2.6.12-rc6-mm1 Andrew Morton @ 2005-08-04 22:44 ` Dominik Karall 0 siblings, 0 replies; 72+ messages in thread From: Dominik Karall @ 2005-08-04 22:44 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel, Ingo Molnar [-- Attachment #1: Type: text/plain, Size: 5121 bytes --] On Friday 05 August 2005 00:28, Andrew Morton wrote: > Andrew Morton <akpm@osdl.org> wrote: > > Dominik Karall <dominik.karall@gmx.net> wrote: > > > On Friday 29 July 2005 23:27, Andrew Morton wrote: > > > > Dominik Karall <dominik.karall@gmx.net> wrote: > > > > > On Friday 29 July 2005 20:22, Andrew Morton wrote: > > > > > > Dominik Karall <dominik.karall@gmx.net> wrote: > > > > > > > On Friday 29 July 2005 06:54, Andrew Morton wrote: > > > > > > > > Dominik Karall <dominik.karall@gmx.net> wrote: > > > > > > > > > On Tuesday 07 June 2005 13:29, Andrew Morton wrote: > > > > > > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches > > > > > > > > > >/2.6/2 .6.1 2-rc 6/2. 6.12-rc6-mm1/ > > > > > > > > > > > > > > > > > > After looking in my dmesg output today, I saw following > > > > > > > > > error with 2.6.12-rc6-mm1, maybe it's usefull to you. I > > > > > > > > > don't know when it exactly happens, cause I never used mono > > > > > > > > > last time, I just did an emerge mono on my gentoo system, > > > > > > > > > maybe this forced the failure. 
> > > > > > > > > note: mono[26736] exited with preempt_count 1
> > > > > > > > > scheduling while atomic: mono/0x10000001/26736
> > > > > > > > >
> > > > > > > > > Call Trace:<ffffffff803e13ea>{schedule+122}
> > > > > > > > > <ffffffff8013197b>{vprintk+635}
> > > > > > > > > <ffffffff803e2738>{cond_resched+56}
> > > > > > > > > <ffffffff80164de3>{unmap_vmas+1587}
> > > > > > > > > <ffffffff8016a560>{exit_mmap+128}
> > > > > > > > > <ffffffff8012e7bf>{mmput+31}
> > > > > > > > > <ffffffff80133466>{do_exit+438}
> > > > > > > > > <ffffffff8013bf25>{__dequeue_signal+501}
> > > > > > > > > <ffffffff801340c8>{do_group_exit+280}
> > > > > > > > > <ffffffff8013e147>{get_signal_to_deliver+1575}
> > > > > > > > > <ffffffff8010de92>{do_signal+162}
> > > > > > > > > <ffffffff8012d1e0>{default_wake_function+0}
> > > > > > > > > <ffffffff8010e8e1>{sys_rt_sigreturn+577}
> > > > > > > > > <ffffffff8010eb3f>{sysret_signal+28}
> > > > > > > > > <ffffffff8010ee27>{ptregscall_common+103}
> > > > > > > >
> > > > > > > > A couple of people reported this, but all seems to have gone
> > > > > > > > quiet. Is it fixed in later -mm's? Is 2.6.13-rc4 running
> > > > > > > > OK?
> > > > > > > >
> > > > > > > > Thanks.
> > > > > > >
> > > > > > > Hi Andrew!
> > > > > > >
> > > > > > > I'm sorry, but it's not fixed in current 2.6.13-rc3-mm3. I did
> > > > > > > an emerge mono right now to test it, and I got this one:
> > > > > > > Jul 29 15:26:37 [kernel] note: mono[11138] exited with
> > > > > > > preempt_count 1
> > > > > > > Jul 29 15:26:50 [kernel] file[14627]: segfault at
> > > > > > > 00002aaaab453000 rip 00002aaaaaf652cf rsp 00007fffffe43b50
> > > > > > > error 4
> > > > > > > Jul 29 15:26:50 [kernel] file[14633]: segfault at
> > > > > > > 00002aaaab453000 rip 00002aaaaaf652cf rsp 00007fffffcc87a0
> > > > > > > error 4
> > > > > > > Jul 29 15:26:51 [kernel] file[14669]: segfault at
> > > > > > > 00002aaaab453000 rip 00002aaaaaf652cf rsp 00007fffff905f80
> > > > > > > error 4
> > > > > > >
> > > > > > > DEBUG_KERNEL/PREEMPT/SPINLOCK are enabled, but I didn't get
> > > > > > > more info about the bug. Did I forget any debug option?
> > > > > >
> > > > > > Gee, I don't know how to find this one. Do you know if the
> > > > > > problem is specific to -mm?
> > > > >
> > > > > Tested with 2.6.13-rc4 and it seems to work. Didn't get any error.
> > > >
> > > > Great, thanks for that.
> > > >
> > > > > So it seems to be -mm related. Do you suspect any patch which could
> > > > > cause the error?
> > > >
> > > > I wouldn't know, sorry. Possibly the scheduler patches, possibly an
> > > > x86_64-specific patch. Is the problem repeatable? If so, a binary
> > > > search would only take ten build-n-boots ;)
> > >
> > > Yes, it is repeatable. I tested on latest -mm about 4 times. Ok, I
> > > will try to find the right patch tomorrow, 10 build-n-boots would end
> > > up in morning ;)
> > >
> > > btw, as the error occurred in 2.6.12-rc6-mm1 too, it must be an old
> > > patch which wasn't merged to Linus' tree till now...hope there aren't a
> > > lot of them :)
> >
> > Any progress on this? It kinda means that the whole of the -mm lineup is
> > stuck until we can identify the offending patch. We have a couple of
> > weeks in which to do this but if you can identify the bad patch it'd help
> > enormously, thanks.
>
> OK, Bartosz Taudul tells me that he's occasionally seeing this on stock
> 2.6.12 (thanks!). So there's not a lot of point in doing the -mm bisection
> search.
>
> I think Ingo was planning on coming up with some infrastructure which would
> allow us to debug this further.

I'm sorry that I couldn't do the tests earlier, but I had no time this week.
I did some tests now and noticed that the bug only occurs when KDE is
running...weird. I'm going to continue testing tomorrow after work, exactly
in 12 hours ;)
I will let you know if I have any news!

dominik

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]
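The "ten build-n-boots" estimate quoted above follows from bisection arithmetic: each test halves the range of candidate patches, so a series of roughly 1000 patches needs at most ceil(log2(1000)) = 10 tests. The sketch below illustrates that reasoning in Python. It is not anything used in the thread: the patch names, the series size of 1000, and the `is_bad` callback (standing in for "apply these patches, build, boot, run the reproducer") are all hypothetical, and it assumes the bug reproduces reliably and stays present once the offending patch is applied.

```python
from typing import Callable, Sequence, Tuple


def first_bad_patch(
    patches: Sequence[str],
    is_bad: Callable[[Sequence[str]], bool],
) -> Tuple[str, int]:
    """Bisect an ordered patch series for the first patch that introduces a bug.

    Assumptions (hypothetical, for illustration only):
      - the base tree with no patches applied is good,
      - the tree with the full series applied is bad,
      - is_bad(prefix) builds and boots a kernel with `prefix` applied and
        reports whether the bug shows up.
    Returns the offending patch and the number of build-and-boot tests used.
    """
    lo, hi = 0, len(patches)  # invariant: patches[:lo] is good, patches[:hi] is bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(patches[:mid]):
            hi = mid  # bug already present: culprit is among the first `mid` patches
        else:
            lo = mid  # still good: culprit comes later in the series
    return patches[hi - 1], tests


if __name__ == "__main__":
    # Hypothetical series of 1000 patches with the culprit hidden at index 617.
    series = [f"patch-{i:04d}" for i in range(1000)]
    culprit, tests = first_bad_patch(series, lambda applied: "patch-0617" in applied)
    print(culprit, tests)
```

As the last quoted reply notes, the bug was also seen on stock 2.6.12, so a bisection of the -mm series alone would not have isolated it; the sketch only shows where the count of ten comes from.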