* perf counter issue - WARN_ON_ONCE(!list_empty(&tsk->perf_counter_ctx.counter_list));
@ 2009-05-13 16:54 Srivatsa Vaddagiri
  2009-05-13 16:57 ` Srivatsa Vaddagiri
  0 siblings, 1 reply; 10+ messages in thread
From: Srivatsa Vaddagiri @ 2009-05-13 16:54 UTC
  To: Ingo Molnar, a.p.zijlstra; +Cc: linux-kernel

I am hitting this warning in kernel/exit.c (latest -tip code):

	WARN_ON_ONCE(!list_empty(&tsk->perf_counter_ctx.counter_list));

I was using perf counters against a java benchmark. Any suggestions for me to
try out? Full stack-traceback is below:

WARNING: at kernel/exit.c:158 delayed_put_task_struct+0x34/0x7c()
Hardware name: IBM System x3550 -[79787AA]-
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 cpufreq_ondemand acpi_cpufreq freq_table dm_mirror dm_multipath scsi_dh sbs sbshc battery acpi_memhotplug ac parport_pc lp parport sg sr_mod bnx2 ide_cd_mod cdrom button rtc_cmos rtc_core rtc_lib i2c_i801 i2c_core i5000_edac edac_core pcspkr dm_region_hash dm_log dm_mod usb_storage ata_piix libata shpchp aacraid sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
Pid: 0, comm: swapper Not tainted 2.6.30-rc5-tip #12
Call Trace:
 <IRQ>  [<ffffffff8023eab8>] ? warn_slowpath_fmt+0xd9/0xf6
 [<ffffffff80351233>] ? cpumask_next_and+0x2a/0x3b
 [<ffffffff80233de3>] ? find_busiest_group+0x1e6/0x82a
 [<ffffffff802535fc>] ? lock_hrtimer_base+0x1b/0x3c
 [<ffffffff802886d6>] ? __perf_counter_sched_in+0x1c/0x143
 [<ffffffff802181f0>] ? x86_pmu_enable+0x136/0x161
 [<ffffffff80287a19>] ? counter_sched_in+0x26/0x8e
 [<ffffffff8028804d>] ? group_sched_in+0x54/0xe2
 [<ffffffff80256ec6>] ? getnstimeofday+0x56/0xb5
 [<ffffffff80240193>] ? delayed_put_task_struct+0x34/0x7c
 [<ffffffff80272af6>] ? __rcu_process_callbacks+0x101/0x1ac
 [<ffffffff80272bc7>] ? rcu_process_callbacks+0x26/0x4a
 [<ffffffff80243eed>] ? __do_softirq+0xa3/0x162
 [<ffffffff8020ca3c>] ? call_softirq+0x1c/0x28
 [<ffffffff8020dcc6>] ? do_softirq+0x2c/0x68
 [<ffffffff8021e7b8>] ? smp_apic_timer_interrupt+0x8d/0x9d
 [<ffffffff8020c413>] ? apic_timer_interrupt+0x13/0x20
 <EOI>  [<ffffffff80211ed0>] ? mwait_idle+0xa3/0xd1
 [<ffffffff804cd309>] ? notifier_call_chain+0x29/0x4c
 [<ffffffff8020aa1d>] ? cpu_idle+0x40/0x5e
---[ end trace 026134fcfb5c588d ]---
WARNING: at kernel/exit.c:158 delayed_put_task_struct+0x34/0x7c()
Hardware name: IBM System x3550 -[79787AA]-
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 cpufreq_ondemand acpi_cpufreq freq_table dm_mirror dm_multipath scsi_dh sbs sbshc battery acpi_memhotplug ac parport_pc lp parport sg sr_mod bnx2 ide_cd_mod cdrom button rtc_cmos rtc_core rtc_lib i2c_i801 i2c_core i5000_edac edac_core pcspkr dm_region_hash dm_log dm_mod usb_storage ata_piix libata shpchp aacraid sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
Pid: 0, comm: swapper Tainted: G W 2.6.30-rc5-tip #12
Call Trace:
 <IRQ>  [<ffffffff8023eab8>] ? warn_slowpath_fmt+0xd9/0xf6
 [<ffffffff80351233>] ? cpumask_next_and+0x2a/0x3b
 [<ffffffff80233de3>] ? find_busiest_group+0x1e6/0x82a
 [<ffffffff80230df4>] ? update_curr+0x6f/0xab
 [<ffffffff802303b4>] ? wakeup_preempt_entity+0xb5/0xc2
 [<ffffffff8023b994>] ? check_preempt_wakeup+0x118/0x135
 [<ffffffff802535fc>] ? lock_hrtimer_base+0x1b/0x3c
 [<ffffffff8025359d>] ? hrtimer_init+0x18/0x5c
 [<ffffffff80288d1a>] ? task_clock_perf_counter_enable+0x29/0x57
 [<ffffffff80287eec>] ? perf_swcounter_enable+0x5/0x8
 [<ffffffff80287a19>] ? counter_sched_in+0x26/0x8e
 [<ffffffff8028804d>] ? group_sched_in+0x54/0xe2
 [<ffffffff80256ec6>] ? getnstimeofday+0x56/0xb5
 [<ffffffff80240193>] ? delayed_put_task_struct+0x34/0x7c
 [<ffffffff80272af6>] ? __rcu_process_callbacks+0x101/0x1ac
 [<ffffffff80272bc7>] ? rcu_process_callbacks+0x26/0x4a
 [<ffffffff80243eed>] ? __do_softirq+0xa3/0x162
 [<ffffffff8020ca3c>] ? call_softirq+0x1c/0x28
 [<ffffffff8020dcc6>] ? do_softirq+0x2c/0x68
 [<ffffffff8021e7b8>] ? smp_apic_timer_interrupt+0x8d/0x9d
 [<ffffffff8020c413>] ? apic_timer_interrupt+0x13/0x20
 <EOI>  [<ffffffff80211ed0>] ? mwait_idle+0xa3/0xd1
 [<ffffffff804cd309>] ? notifier_call_chain+0x29/0x4c
 [<ffffffff8020aa1d>] ? cpu_idle+0x40/0x5e
---[ end trace 026134fcfb5c588e ]---

- vatsa

* Re: perf counter issue - WARN_ON_ONCE(!list_empty(&tsk->perf_counter_ctx.counter_list));
  2009-05-13 16:54 perf counter issue - WARN_ON_ONCE(!list_empty(&tsk->perf_counter_ctx.counter_list)); Srivatsa Vaddagiri
@ 2009-05-13 16:57 ` Srivatsa Vaddagiri
  2009-05-15 13:56 ` Ingo Molnar
  0 siblings, 1 reply; 10+ messages in thread
From: Srivatsa Vaddagiri @ 2009-05-13 16:57 UTC
  To: Ingo Molnar, a.p.zijlstra; +Cc: linux-kernel

On Wed, May 13, 2009 at 10:24:33PM +0530, Srivatsa Vaddagiri wrote:
> I am hitting this warning in kernel/exit.c (latest -tip code):
>
> WARN_ON_ONCE(!list_empty(&tsk->perf_counter_ctx.counter_list));
>
> I was using perf counters against a java benchmark.

Basically using perf counters as:

	$ perf stat java ....

- vatsa

* Re: perf counter issue - WARN_ON_ONCE(!list_empty(&tsk->perf_counter_ctx.counter_list));
  2009-05-13 16:57 ` Srivatsa Vaddagiri
@ 2009-05-15 13:56 ` Ingo Molnar
  2009-05-15 14:51 ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 10+ messages in thread
From: Ingo Molnar @ 2009-05-15 13:56 UTC
  To: Srivatsa Vaddagiri
  Cc: a.p.zijlstra, linux-kernel, Mike Galbraith, Thomas Gleixner,
	Arnaldo Carvalho de Melo, Paul Mackerras, Corey Ashford

* Srivatsa Vaddagiri <vatsa@in.ibm.com> wrote:

> On Wed, May 13, 2009 at 10:24:33PM +0530, Srivatsa Vaddagiri wrote:
> > I am hitting this warning in kernel/exit.c (latest -tip code):
> >
> > WARN_ON_ONCE(!list_empty(&tsk->perf_counter_ctx.counter_list));
> >
> > I was using perf counters against a java benchmark.
>
> Basically using perf counters as:
>
> $ perf stat java ....

hm, is there a reproducer perhaps? Is there some class file i could
run with specific parameters to reproduce it?

	Ingo

* Re: perf counter issue - WARN_ON_ONCE(!list_empty(&tsk->perf_counter_ctx.counter_list));
  2009-05-15 13:56 ` Ingo Molnar
@ 2009-05-15 14:51 ` Arnaldo Carvalho de Melo
  2009-05-15 15:58 ` Srivatsa Vaddagiri
  0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2009-05-15 14:51 UTC
  To: Ingo Molnar
  Cc: Srivatsa Vaddagiri, a.p.zijlstra, linux-kernel, Mike Galbraith,
	Thomas Gleixner, Paul Mackerras, Corey Ashford

On Fri, May 15, 2009 at 03:56:04PM +0200, Ingo Molnar wrote:
>
> * Srivatsa Vaddagiri <vatsa@in.ibm.com> wrote:
>
> > On Wed, May 13, 2009 at 10:24:33PM +0530, Srivatsa Vaddagiri wrote:
> > > I am hitting this warning in kernel/exit.c (latest -tip code):
> > >
> > > WARN_ON_ONCE(!list_empty(&tsk->perf_counter_ctx.counter_list));
> > >
> > > I was using perf counters against a java benchmark.
> >
> > Basically using perf counters as:
> >
> > $ perf stat java ....
>
> hm, is there a reproducer perhaps? Is there some class file i could
> run with specific parameters to reproduce it?

I'll try this with some java benchmarks we have, AMQP related, lets see
if I can reproduce it.

- Arnaldo

* Re: perf counter issue - WARN_ON_ONCE(!list_empty(&tsk->perf_counter_ctx.counter_list));
  2009-05-15 14:51 ` Arnaldo Carvalho de Melo
@ 2009-05-15 15:58 ` Srivatsa Vaddagiri
  2009-05-15 16:13 ` Peter Zijlstra
  0 siblings, 1 reply; 10+ messages in thread
From: Srivatsa Vaddagiri @ 2009-05-15 15:58 UTC
  To: Arnaldo Carvalho de Melo
  Cc: Ingo Molnar, a.p.zijlstra, linux-kernel, Mike Galbraith,
	Thomas Gleixner, Paul Mackerras, Corey Ashford

On Fri, May 15, 2009 at 11:51:44AM -0300, Arnaldo Carvalho de Melo wrote:
> > hm, is there a reproducer perhaps? Is there some class file i could
> > run with specific parameters to reproduce it?
>
> I'll try this with some java benchmarks we have, AMQP related, lets see
> if I can reproduce it.

I tried this with SPECJbb - which I am not at liberty to distribute
unfortunately. I will try and recreate it with volanomark or other open
source benchmarks.

- vatsa

* Re: perf counter issue - WARN_ON_ONCE(!list_empty(&tsk->perf_counter_ctx.counter_list));
  2009-05-15 15:58 ` Srivatsa Vaddagiri
@ 2009-05-15 16:13 ` Peter Zijlstra
  2009-05-15 17:37 ` Peter Zijlstra
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2009-05-15 16:13 UTC
  To: vatsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, linux-kernel, Mike Galbraith,
	Thomas Gleixner, Paul Mackerras, Corey Ashford

On Fri, 2009-05-15 at 21:28 +0530, Srivatsa Vaddagiri wrote:
> On Fri, May 15, 2009 at 11:51:44AM -0300, Arnaldo Carvalho de Melo wrote:
> > > hm, is there a reproducer perhaps? Is there some class file i could
> > > run with specific parameters to reproduce it?
> >
> > I'll try this with some java benchmarks we have, AMQP related, lets see
> > if I can reproduce it.
>
> I tried this with SPECJbb - which I am not at liberty to distribute
> unfortunately. I will try and recreate it with volanomark or other open
> source benchmarks.

I could indeed reproduce with vmark. Am poking at it.. still clueless
though ;-)

* Re: perf counter issue - WARN_ON_ONCE(!list_empty(&tsk->perf_counter_ctx.counter_list));
  2009-05-15 16:13 ` Peter Zijlstra
@ 2009-05-15 17:37 ` Peter Zijlstra
  2009-05-15 20:27 ` Peter Zijlstra
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2009-05-15 17:37 UTC
  To: vatsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, linux-kernel, Mike Galbraith,
	Thomas Gleixner, Paul Mackerras, Corey Ashford, Oleg Nesterov

On Fri, 2009-05-15 at 18:13 +0200, Peter Zijlstra wrote:
> On Fri, 2009-05-15 at 21:28 +0530, Srivatsa Vaddagiri wrote:
> > On Fri, May 15, 2009 at 11:51:44AM -0300, Arnaldo Carvalho de Melo wrote:
> > > > hm, is there a reproducer perhaps? Is there some class file i could
> > > > run with specific parameters to reproduce it?
> > >
> > > I'll try this with some java benchmarks we have, AMQP related, lets see
> > > if I can reproduce it.
> >
> > I tried this with SPECJbb - which I am not at liberty to distribute
> > unfortunately. I will try and recreate it with volanomark or other open
> > source benchmarks.
>
> I could indeed reproduce with vmark. Am poking at it.. still clueless
> though ;-)

[root@opteron tmp]# cat foo.c
#include <pthread.h>
#include <unistd.h>

void *thread(void *arg)
{
	sleep(5);
	return NULL;
}

void main(void)
{
	pthread_t thr;
	pthread_create(&thr, NULL, thread, NULL);
}

The above instantly triggers it. It appears we fail to cleanup on the
reparent path. I'll go root around in exit.c.

* Re: perf counter issue - WARN_ON_ONCE(!list_empty(&tsk->perf_counter_ctx.counter_list));
  2009-05-15 17:37 ` Peter Zijlstra
@ 2009-05-15 20:27 ` Peter Zijlstra
  2009-05-18  4:45 ` Paul Mackerras
  2009-05-22  6:25 ` Srivatsa Vaddagiri
  0 siblings, 2 replies; 10+ messages in thread
From: Peter Zijlstra @ 2009-05-15 20:27 UTC
  To: vatsa
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, linux-kernel, Mike Galbraith,
	Thomas Gleixner, Paul Mackerras, Corey Ashford, Oleg Nesterov

On Fri, 2009-05-15 at 19:37 +0200, Peter Zijlstra wrote:
> On Fri, 2009-05-15 at 18:13 +0200, Peter Zijlstra wrote:
> > On Fri, 2009-05-15 at 21:28 +0530, Srivatsa Vaddagiri wrote:
> > > On Fri, May 15, 2009 at 11:51:44AM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > hm, is there a reproducer perhaps? Is there some class file i could
> > > > > run with specific parameters to reproduce it?
> > > >
> > > > I'll try this with some java benchmarks we have, AMQP related, lets see
> > > > if I can reproduce it.
> > >
> > > I tried this with SPECJbb - which I am not at liberty to distribute
> > > unfortunately. I will try and recreate it with volanomark or other open
> > > source benchmarks.
> >
> > I could indeed reproduce with vmark. Am poking at it.. still clueless
> > though ;-)
>
> [root@opteron tmp]# cat foo.c
> #include <pthread.h>
> #include <unistd.h>
>
> void *thread(void *arg)
> {
> 	sleep(5);
> 	return NULL;
> }
>
> void main(void)
> {
> 	pthread_t thr;
> 	pthread_create(&thr, NULL, thread, NULL);
> }
>
> The above instantly triggers it. It appears we fail to cleanup on the
> reparent path. I'll go root around in exit.c.

OK, so the cleanup isn't solid.. I've been poking at things, and below
is the current state of my tinkering, but it seems to make things
worse...

With only the callback in do_exit() the above test works but hackbench
fails, with only the call in wait_task_zombie() hackbench works and the
above fails.

With both, we segfault the kernel on a list op on either :-)

Will continue poking tomorrow and such, unless someone beats me to it.

---
Index: linux-2.6/kernel/perf_counter.c
===================================================================
--- linux-2.6.orig/kernel/perf_counter.c
+++ linux-2.6/kernel/perf_counter.c
@@ -115,6 +115,7 @@ list_add_counter(struct perf_counter *co
 	}
 
 	list_add_rcu(&counter->event_entry, &ctx->event_list);
+	ctx->nr_counters++;
 }
 
 static void
@@ -122,6 +123,8 @@ list_del_counter(struct perf_counter *co
 {
 	struct perf_counter *sibling, *tmp;
 
+	ctx->nr_counters--;
+
+	list_del_init(&counter->list_entry);
 	list_del_rcu(&counter->event_entry);
 
@@ -209,7 +212,6 @@ static void __perf_counter_remove_from_c
 	counter_sched_out(counter, cpuctx, ctx);
 
 	counter->task = NULL;
-	ctx->nr_counters--;
 
 	/*
 	 * Protect the list operation against NMI by disabling the
@@ -276,7 +278,6 @@ retry:
 	 * succeed.
 	 */
 	if (!list_empty(&counter->list_entry)) {
-		ctx->nr_counters--;
 		list_del_counter(counter, ctx);
 		counter->task = NULL;
 	}
@@ -544,7 +545,6 @@ static void add_counter_to_ctx(struct pe
 			       struct perf_counter_context *ctx)
 {
 	list_add_counter(counter, ctx);
-	ctx->nr_counters++;
 	counter->prev_state = PERF_COUNTER_STATE_OFF;
 	counter->tstamp_enabled = ctx->time;
 	counter->tstamp_running = ctx->time;
@@ -3252,8 +3252,8 @@ __perf_counter_exit_task(struct task_str
 	 */
 	if (child != current) {
 		wait_task_inactive(child, 0);
-		list_del_init(&child_counter->list_entry);
 		update_counter_times(child_counter);
+		list_del_counter(child_counter, child_ctx);
 	} else {
 		struct perf_cpu_context *cpuctx;
 		unsigned long flags;
@@ -3272,9 +3272,7 @@ __perf_counter_exit_task(struct task_str
 		group_sched_out(child_counter, cpuctx, child_ctx);
 		update_counter_times(child_counter);
 
-		list_del_init(&child_counter->list_entry);
-
-		child_ctx->nr_counters--;
+		list_del_counter(child_counter, child_ctx);
 
 		perf_enable();
 		local_irq_restore(flags);
@@ -3288,13 +3286,6 @@ __perf_counter_exit_task(struct task_str
 	 */
 	if (parent_counter) {
 		sync_child_counter(child_counter, parent_counter);
-		list_for_each_entry_safe(sub, tmp, &child_counter->sibling_list,
-					 list_entry) {
-			if (sub->parent) {
-				sync_child_counter(sub, sub->parent);
-				free_counter(sub);
-			}
-		}
 		free_counter(child_counter);
 	}
 }
@@ -3315,9 +3306,18 @@ void perf_counter_exit_task(struct task_
 	if (likely(!child_ctx->nr_counters))
 		return;
 
+again:
 	list_for_each_entry_safe(child_counter, tmp, &child_ctx->counter_list,
 				 list_entry)
 		__perf_counter_exit_task(child, child_counter, child_ctx);
+
+	/*
+	 * If the last counter was a group counter, it will have appended all
+	 * its siblings to the list, but we obtained 'tmp' before that which
+	 * will still point to the list head terminating the iteration.
+	 */
+	if (!list_empty(&child_ctx->counter_list))
+		goto again;
 }
 
 /*
Index: linux-2.6/kernel/exit.c
===================================================================
--- linux-2.6.orig/kernel/exit.c
+++ linux-2.6/kernel/exit.c
@@ -155,7 +155,7 @@ static void delayed_put_task_struct(stru
 	struct task_struct *tsk = container_of(rhp, struct task_struct, rcu);
 
 #ifdef CONFIG_PERF_COUNTERS
-	WARN_ON_ONCE(!list_empty(&tsk->perf_counter_ctx.counter_list));
+	WARN_ON(!list_empty(&tsk->perf_counter_ctx.counter_list));
 #endif
 	trace_sched_process_free(tsk);
 	put_task_struct(tsk);
@@ -1002,6 +1002,8 @@ NORET_TYPE void do_exit(long code)
 	if (tsk->splice_pipe)
 		__free_pipe_info(tsk->splice_pipe);
 
+	perf_counter_exit_task(tsk);
+
 	preempt_disable();
 	/* causes final put_task_struct in finish_task_switch(). */
 	tsk->state = TASK_DEAD;

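The reason for the "again:" rescan in perf_counter_exit_task() can be illustrated outside the kernel. What follows is only a userspace sketch under assumptions: struct node and every helper name here are hypothetical stand-ins, not the kernel's list.h or perf_counter code. It models a "safe" walk that samples the next pointer before the loop body runs, and a group leader that hands its siblings to the list tail when it is removed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a counter linked on a context's list. */
struct node {
	struct node *prev, *next;
	int nr_siblings;        /* > 0 marks a "group leader" */
	struct node *siblings;  /* sibling array, not on the list yet */
};

static void list_init(struct node *head) { head->prev = head->next = head; }
static bool list_is_empty(const struct node *head) { return head->next == head; }

static void list_add_tail_node(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_del_node(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

/* Remove one entry; a leader appends its siblings to the list tail. */
static void exit_one(struct node *n, struct node *head)
{
	int i;

	list_del_node(n);
	for (i = 0; i < n->nr_siblings; i++)
		list_add_tail_node(&n->siblings[i], head);
}

/* One "safe" pass: tmp is read before the body can append anything. */
static int drain_once(struct node *head)
{
	struct node *pos, *tmp;
	int done = 0;

	for (pos = head->next, tmp = pos->next; pos != head;
	     pos = tmp, tmp = pos->next) {
		exit_one(pos, head);
		done++;
	}
	return done;
}

/* Repeat the pass until nothing is left, like the "goto again" above. */
static int drain_all(struct node *head)
{
	int done = 0;

	do {
		done += drain_once(head);
	} while (!list_is_empty(head));
	return done;
}
```

With a plain entry followed by a leader that has two siblings, a single drain_once() pass handles two entries and leaves the two siblings on the list, because tmp already pointed at the head when the leader was processed. drain_all() keeps going until the list is empty.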
* Re: perf counter issue - WARN_ON_ONCE(!list_empty(&tsk->perf_counter_ctx.counter_list));
  2009-05-15 20:27 ` Peter Zijlstra
@ 2009-05-18  4:45 ` Paul Mackerras
  2009-05-22  6:25 ` Srivatsa Vaddagiri
  1 sibling, 0 replies; 10+ messages in thread
From: Paul Mackerras @ 2009-05-18  4:45 UTC
  To: Peter Zijlstra
  Cc: vatsa, Arnaldo Carvalho de Melo, Ingo Molnar, linux-kernel,
	Mike Galbraith, Thomas Gleixner, Corey Ashford, Oleg Nesterov

Peter Zijlstra writes:

> OK, so the cleanup isn't solid.. I've been poking at things, and below
> is the current state of my tinkering, but it seems to make things
> worse...
>
> With only the callback in do_exit() the above test works but hackbench
> fails, with only the call in wait_task_zombie() hackbench works and the
> above fails.
>
> With both, we segfault the kernel on a list op on either :-)

I don't know if this is the problem, but I have noticed a basic
lifetime issue: a counter on a task points to a context which is
embedded in the task_struct of the task being counted, but the counter
might outlive the task. For example, task A puts a counter on task B,
task B dies and is reaped by its parent, but the counter still exists
because task A hasn't closed its fd. When task A does close the fd,
perf_release will call perf_counter_remove_from_context which will go
and use counter->ctx, but that is in B's task struct, which has gone
away.

I want to change the task struct to have just a pointer to the context
rather than the context struct itself for other reasons (it will make
it much easier to implement lazy PMU switching). If we do that we
could refcount the context and solve the lifetime issue that way.

I'm working on a patch; hopefully I'll have more to report later
today.

Paul.

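The refcounting idea described above might look roughly like the following. This is a minimal userspace sketch, not the patch Paul mentions: all struct and function names are hypothetical, a plain int stands in for the atomic reference count a kernel implementation would need, and ctx_frees exists only so the example can observe when the context is released.

```c
#include <stdlib.h>

/* All names are hypothetical; this is not the kernel's API. */
struct ctx {
	int refcount;     /* a kernel version would need an atomic count */
	int nr_counters;
};

struct task    { struct ctx *ctx; };  /* pointer, not an embedded struct */
struct counter { struct ctx *ctx; };

static int ctx_frees;                 /* instrumentation for this example */

static struct ctx *ctx_get(struct ctx *c)
{
	c->refcount++;
	return c;
}

static void ctx_put(struct ctx *c)
{
	if (--c->refcount == 0) {
		free(c);
		ctx_frees++;
	}
}

/* The task holds the initial reference. Returns 0 on success, -1 on failure. */
static int task_init(struct task *t)
{
	t->ctx = calloc(1, sizeof(*t->ctx));
	if (!t->ctx)
		return -1;
	t->ctx->refcount = 1;
	return 0;
}

/* Each counter pins the context it was attached to. */
static void counter_attach(struct counter *cnt, struct task *t)
{
	cnt->ctx = ctx_get(t->ctx);
	cnt->ctx->nr_counters++;
}

/* Task B dies: only its own reference goes away. */
static void task_exit(struct task *t)
{
	ctx_put(t->ctx);
	t->ctx = NULL;
}

/* Task A closes the fd later: the context is still valid here. */
static void counter_release(struct counter *cnt)
{
	cnt->ctx->nr_counters--;
	ctx_put(cnt->ctx);
	cnt->ctx = NULL;
}
```

In the A/B scenario from the message, calling task_exit() for B before counter_release() for A's counter leaves the context allocated until the last reference is dropped, so the late release no longer touches freed memory.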
* Re: perf counter issue - WARN_ON_ONCE(!list_empty(&tsk->perf_counter_ctx.counter_list));
  2009-05-15 20:27 ` Peter Zijlstra
  2009-05-18  4:45 ` Paul Mackerras
@ 2009-05-22  6:25 ` Srivatsa Vaddagiri
  1 sibling, 0 replies; 10+ messages in thread
From: Srivatsa Vaddagiri @ 2009-05-22  6:25 UTC
  To: Peter Zijlstra
  Cc: Arnaldo Carvalho de Melo, Ingo Molnar, linux-kernel, Mike Galbraith,
	Thomas Gleixner, Paul Mackerras, Corey Ashford, Oleg Nesterov

On Fri, May 15, 2009 at 10:27:08PM +0200, Peter Zijlstra wrote:
> OK, so the cleanup isn't solid.. I've been poking at things, and below
> is the current state of my tinkering, but it seems to make things
> worse...

I got around to test the latest tip today and my problem has
disappeared. Thanks!

- vatsa