From: Clark Williams <williams@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>,
Arnaldo Carvalho de Melo <acme@redhat.com>,
Ingo Molnar <mingo@elte.hu>
Cc: RT <linux-rt-users@vger.kernel.org>
Subject: perf backtrace with 3.0.1-rt9
Date: Fri, 12 Aug 2011 14:39:30 -0500 [thread overview]
Message-ID: <20110812143930.5cd2d5b8@redhat.com> (raw)
[-- Attachment #1.1: Type: text/plain, Size: 767 bytes --]
Peter (et al),
Attached is another backtrace while running perf, this time on
3.0.1-rt9:
$ sudo perf record rteval --duration=30m
The rteval python script is really just an evolution of Ingo's old
'dohell' script. Kicks off cyclictest with a measurement thread per
core, then runs an endless kernel make -j<n*2> (where n is the number of
cpu cores) and a large number of hackbench invocations. You can get it
at:
git://git.kernel.org/pub/scm/linux/kernel/git/clrkwllms/rteval.git
While the perf record was going, I kicked off perf top in another
window. Took about 10-15min on a quad-core Opteron box. Got three
warnings and then things went seriously downhill from there.
Let me know if I can get you any more information.
Clark
[-- Attachment #1.2: splat.txt --]
[-- Type: text/plain, Size: 13204 bytes --]
WARNING: at kernel/events/core.c:1976 task_ctx_sched_out+0x57/0x7d()
Hardware name: empty
Modules linked in: nfs lockd auth_rpcgss nfs_acl sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state ip6table_filter ip6_tables x_tables ipv6 dm_mirror dm_region_hash dm_log lm87 hwmon_vid hwmon microcode serio_raw pcspkr edac_core sfc mtd i2c_algo_bit mdio forcedeth sg i2c_nforce2 i2c_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom sata_nv pata_acpi ata_generic pata_amd libata ehci_hcd ohci_hcd button [last unloaded: scsi_wait_scan]
Pid: 21620, comm: sh Not tainted 3.0.1-rt9.7.el6rt.x86_64 #1
Call Trace:
[<ffffffff81044672>] warn_slowpath_common+0x85/0x9d
[<ffffffff810446a4>] warn_slowpath_null+0x1a/0x1c
[<ffffffff810be49a>] task_ctx_sched_out+0x57/0x7d
[<ffffffff810c5283>] ? perf_event_comm+0x46/0x35b
[<ffffffff810c52bc>] perf_event_comm+0x7f/0x35b
[<ffffffff81039480>] ? get_parent_ip+0x11/0x41
[<ffffffff81039480>] ? get_parent_ip+0x11/0x41
[<ffffffff813b8837>] ? sub_preempt_count+0x97/0xab
[<ffffffff8111704c>] set_task_comm+0x51/0x58
[<ffffffff81117136>] setup_new_exec+0xe3/0x294
[<ffffffff81151532>] load_elf_binary+0x330/0x171a
[<ffffffff8103f812>] ? migrate_enable+0x138/0x14a
[<ffffffff8114edb6>] ? load_misc_binary+0x17c/0x366
[<ffffffff813b8837>] ? sub_preempt_count+0x97/0xab
[<ffffffff8103532e>] ? need_resched+0x23/0x2d
[<ffffffff813b4170>] ? preempt_schedule_irq+0x81/0x87
[<ffffffff813b5c06>] ? retint_kernel+0x26/0x30
[<ffffffff81039480>] ? get_parent_ip+0x11/0x41
[<ffffffff813b8837>] ? sub_preempt_count+0x97/0xab
[<ffffffff81117d5d>] search_binary_handler+0xb2/0x256
[<ffffffff81151202>] ? set_brk+0xed/0xed
[<ffffffff8111808a>] do_execve_common+0x189/0x2af
[<ffffffff8110573d>] ? kmem_cache_alloc+0xea/0x172
[<ffffffff81118244>] do_execve+0x43/0x45
[<ffffffff8100918d>] sys_execve+0x43/0x5a
[<ffffffff813bbd9c>] stub_execve+0x6c/0xc0
---[ end trace 0000000000000002 ]---
------------[ cut here ]------------
WARNING: at arch/x86/kernel/cpu/perf_event.c:1161 x86_pmu_start+0x5c/0x10c()
Hardware name: empty
Modules linked in: nfs lockd auth_rpcgss nfs_acl sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state ip6table_filter ip6_tables x_tables ipv6 dm_mirror dm_region_hash dm_log lm87 hwmon_vid hwmon microcode serio_raw pcspkr edac_core sfc mtd i2c_algo_bit mdio forcedeth sg i2c_nforce2 i2c_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom sata_nv pata_acpi ata_generic pata_amd libata ehci_hcd ohci_hcd button [last unloaded: scsi_wait_scan]
Pid: 5644, comm: cyclictest Tainted: G W 3.0.1-rt9.7.el6rt.x86_64 #1
Call Trace:
<IRQ> [<ffffffff81044672>] warn_slowpath_common+0x85/0x9d
[<ffffffff810446a4>] warn_slowpath_null+0x1a/0x1c
[<ffffffff8100fc47>] x86_pmu_start+0x5c/0x10c
[<ffffffff8100fea8>] x86_pmu_enable+0x1b1/0x25a
[<ffffffff810bd851>] perf_pmu_enable+0x31/0x36
[<ffffffff810c4461>] perf_event_task_tick+0x1e4/0x21c
[<ffffffff8103a80c>] scheduler_tick+0x18b/0x1f2
[<ffffffff810523fd>] update_process_times+0x36/0x57
[<ffffffff810703ac>] tick_sched_timer+0xfc/0x125
[<ffffffff8106543e>] __run_hrtimer+0xc0/0x15f
[<ffffffff810702b0>] ? tick_setup_sched_timer+0xdc/0xdc
[<ffffffff81065771>] hrtimer_interrupt+0xeb/0x1d6
[<ffffffff813b88fd>] ? add_preempt_count+0xb2/0xb7
[<ffffffff813bd338>] smp_apic_timer_interrupt+0x85/0x98
[<ffffffff813bc313>] apic_timer_interrupt+0x13/0x20
<EOI> [<ffffffff813bba74>] ? sysret_audit+0x16/0x20
---[ end trace 0000000000000003 ]---
------------[ cut here ]------------
WARNING: at arch/x86/kernel/cpu/perf_event.c:1243 x86_pmu_stop+0x79/0xae()
Hardware name: empty
Modules linked in: nfs lockd auth_rpcgss nfs_acl sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state ip6table_filter ip6_tables x_tables ipv6 dm_mirror dm_region_hash dm_log lm87 hwmon_vid hwmon microcode serio_raw pcspkr edac_core sfc mtd i2c_algo_bit mdio forcedeth sg i2c_nforce2 i2c_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom sata_nv pata_acpi ata_generic pata_amd libata ehci_hcd ohci_hcd button [last unloaded: scsi_wait_scan]
Pid: 22341, comm: hackbench Tainted: G W 3.0.1-rt9.7.el6rt.x86_64 #1
Call Trace:
[<ffffffff81044672>] warn_slowpath_common+0x85/0x9d
[<ffffffff810446a4>] warn_slowpath_null+0x1a/0x1c
[<ffffffff8100f752>] x86_pmu_stop+0x79/0xae
[<ffffffff8100ff94>] x86_pmu_del+0x43/0xbe
[<ffffffff810bd6cd>] event_sched_out+0x95/0xde
[<ffffffff810bd739>] group_sched_out+0x23/0x71
[<ffffffff810be41b>] ctx_sched_out+0x93/0xbb
[<ffffffff810c04fe>] __perf_event_task_sched_out+0x2b7/0x2f7
[<ffffffff810c051c>] ? __perf_event_task_sched_out+0x2d5/0x2f7
[<ffffffff8103d3bd>] perf_event_task_sched_out+0x59/0x67
[<ffffffff810356a3>] ? cpuacct_charge+0x5b/0x64
[<ffffffff811cd428>] ? rb_insert_color+0x68/0xe5
[<ffffffff810385e5>] ? pick_next_task_rt+0x8c/0xb3
[<ffffffff813b3920>] __schedule+0x3f7/0x58e
[<ffffffff81039480>] ? get_parent_ip+0x11/0x41
[<ffffffff813b414c>] preempt_schedule_irq+0x5d/0x87
[<ffffffff813b5c06>] retint_kernel+0x26/0x30
[<ffffffff81102e5f>] ? free_block+0x5a/0x10f
[<ffffffff81102e42>] ? free_block+0x3d/0x10f
[<ffffffff8110334d>] __cache_free+0x180/0x1c6
[<ffffffff813b535e>] ? rt_spin_lock+0x24/0x26
[<ffffffff81102d48>] kmem_cache_free+0x6e/0xc7
[<ffffffff81303933>] __kfree_skb+0x7d/0x82
[<ffffffff813039af>] consume_skb+0x77/0x7c
[<ffffffff8138d1ca>] unix_stream_recvmsg+0x594/0x63f
[<ffffffff812f9049>] __sock_recvmsg_nosec+0x6a/0x73
[<ffffffff812f9a3f>] __sock_recvmsg+0x4e/0x59
[<ffffffff812f9b3c>] sock_aio_read+0xf2/0x106
[<ffffffff8118cddd>] ? file_has_perm+0xa7/0xc9
[<ffffffff8111107d>] do_sync_read+0xcb/0x108
[<ffffffff8118a47a>] ? fsnotify_perm+0x66/0x72
[<ffffffff8118a4e7>] ? security_file_permission+0x2e/0x33
[<ffffffff81111b21>] vfs_read+0xbe/0x107
[<ffffffff81103aa9>] ? kfree+0xcf/0xd5
[<ffffffff81111c2d>] sys_read+0x4a/0x71
[<ffffffff813bb942>] system_call_fastpath+0x16/0x1b
---[ end trace 0000000000000004 ]---
Uhhuh. NMI received for unknown reason 2d on CPU 2.
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue
general protection fault: 0000 [#1] PREEMPT SMP
CPU 0
Modules linked in: nfs lockd auth_rpcgss nfs_acl sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state ip6table_filter ip6_tables x_tables ipv6 dm_mirror dm_region_hash dm_log lm87 hwmon_vid hwmon microcode serio_raw pcspkr edac_core sfc mtd i2c_algo_bit mdio forcedeth sg i2c_nforce2 i2c_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom sata_nv pata_acpi ata_generic pata_amd libata ehci_hcd ohci_hcd button [last unloaded: scsi_wait_scan]
Pid: 23657, comm: hackbench Tainted: G W 3.0.1-rt9.7.el6rt.x86_64 #1 empty empty/TYAN Transport GT20-B2925/B2925-E
RIP: 0010:[<ffffffff810c0177>] [<ffffffff810c0177>] perf_event_update_userpage+0x24/0xcf
RSP: 0000:ffff88011fc06d58 EFLAGS: 00010006
RAX: 2d2d2d2d2d2d2d2d RBX: ffff8800967a4c00 RCX: 00000000c0010005
RDX: 000000000000ffff RSI: 00000000a9003a01 RDI: 0000000000000001
RBP: ffff88011fc06d68 R08: 0000000000000000 R09: 0000000000000002
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: ffffffffa9003a01 R14: 00000000a9003a01 R15: ffff88011fc09838
FS: 00007fbf34a31700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fbf34a323dc CR3: 000000009bba6000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process hackbench (pid: 23657, threadinfo ffff8800730b6000, task ffff8800ac5a8740)
Stack:
ffff8800967a4c00 0000000000000001 ffff88011fc06d98 ffffffff8100fbdf
0000000000000000 ffff88011fc09630 0000000000000001 ffff8800967a4c00
ffff88011fc06e48 ffffffff8101197b ffff88011fc06f58 ffff88011fc09830
Call Trace:
<NMI>
[<ffffffff8100fbdf>] x86_perf_event_set_period+0x125/0x131
[<ffffffff8101197b>] x86_pmu_handle_irq+0xc8/0x122
[<ffffffff810943ba>] ? audit_syscall_exit+0xc7/0x14c
[<ffffffff813b6e34>] perf_event_nmi_handler+0x39/0x81
[<ffffffff813b8939>] notifier_call_chain+0x37/0x63
[<ffffffff813b89a2>] __atomic_notifier_call_chain+0x3d/0x4f
[<ffffffff813b89c8>] atomic_notifier_call_chain+0x14/0x16
[<ffffffff813b89f8>] notify_die+0x2e/0x30
[<ffffffff813b6502>] do_nmi+0x6c/0x215
[<ffffffff813b5fd0>] nmi+0x20/0x30
<<EOE>>
Code: e8 ff b9 fd ff c9 c3 55 48 89 e5 41 54 53 48 89 fb e8 8f ae fd ff 48 8b 83 40 02 00 00 48 85 c0 0f 84 a6 00 00 00 bf 01 00 00 00 <4c> 8b 60 58 e8 cb 86 2f 00 41 ff 44 24 08 f6 83 48 01 00 00 01
RIP [<ffffffff810c0177>] perf_event_update_userpage+0x24/0xcf
RSP <ffff88011fc06d58>
---[ end trace 0000000000000005 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 23657, comm: hackbench Tainted: G D W 3.0.1-rt9.7.el6rt.x86_64 #1
Call Trace:
<NMI> [<ffffffff813b32df>] panic+0x9b/0x1b2
[<ffffffff813b686b>] oops_end+0xd8/0xe8
[<ffffffff81004fdb>] die+0x5a/0x63
[<ffffffff813b648e>] do_general_protection+0x143/0x14b
[<ffffffff813b5ce5>] general_protection+0x25/0x30
[<ffffffff810c0177>] ? perf_event_update_userpage+0x24/0xcf
[<ffffffff810c0162>] ? perf_event_update_userpage+0xf/0xcf
[<ffffffff8100fbdf>] x86_perf_event_set_period+0x125/0x131
[<ffffffff8101197b>] x86_pmu_handle_irq+0xc8/0x122
[<ffffffff810943ba>] ? audit_syscall_exit+0xc7/0x14c
[<ffffffff813b6e34>] perf_event_nmi_handler+0x39/0x81
[<ffffffff813b8939>] notifier_call_chain+0x37/0x63
[<ffffffff813b89a2>] __atomic_notifier_call_chain+0x3d/0x4f
[<ffffffff813b89c8>] atomic_notifier_call_chain+0x14/0x16
[<ffffffff813b89f8>] notify_die+0x2e/0x30
[<ffffffff813b6502>] do_nmi+0x6c/0x215
[<ffffffff813b5fd0>] nmi+0x20/0x30
<<EOE>>
------------[ cut here ]------------
kernel BUG at arch/x86/kernel/traps.c:436!
invalid opcode: 0000 [#2] PREEMPT SMP
CPU 0
Modules linked in: nfs lockd auth_rpcgss nfs_acl sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 xt_state ip6table_filter ip6_tables x_tables ipv6 dm_mirror dm_region_hash dm_log lm87 hwmon_vid hwmon microcode serio_raw pcspkr edac_core sfc mtd i2c_algo_bit mdio forcedeth sg i2c_nforce2 i2c_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom sata_nv pata_acpi ata_generic pata_amd libata ehci_hcd ohci_hcd button [last unloaded: scsi_wait_scan]
Pid: 23657, comm: hackbench Tainted: G D W 3.0.1-rt9.7.el6rt.x86_64 #1 empty empty/TYAN Transport GT20-B2925/B2925-E
RIP: 0010:[<ffffffff813b64bd>] [<ffffffff813b64bd>] do_nmi+0x27/0x215
RSP: 0000:ffff88011fc06f28 EFLAGS: 00010002
RAX: ffff8800730b7fd8 RBX: 0000000000000001 RCX: 00000000c0000101
RDX: 00000000ffff8801 RSI: ffffffffffffffff RDI: ffff88011fc06f58
RBP: ffff88011fc06f48 R08: 0000000000000000 R09: 000000000000eb9d
R10: ffff88011fc06b78 R11: 0000000000000078 R12: ffff88011fc06f58
R13: 0000000000000000 R14: 0000000000000000 R15: ffff88011fc09838
FS: 00007fbf34a31700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fbf34a323dc CR3: 000000009bba6000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process hackbench (pid: 23657, threadinfo ffff8800730b6000, task ffff8800ac5a8740)
Stack:
0000000000000000 0000000000000001 0000000000000082 0000000000000000
ffff88011fc06c08 ffffffff813b5fd0 ffff88011fc09838 0000000000000000
0000000000000000 0000000000000082 ffff88011fc06c08 0000000000000000
Call Trace:
<NMI>
[<ffffffff813b5fd0>] nmi+0x20/0x30
[<ffffffff813b33b4>] ? panic+0x170/0x1b2
<<EOE>>
Code: 41 5d c9 c3 55 48 89 e5 41 55 41 54 49 89 fc 53 48 83 ec 08 e8 d3 88 c6 ff 65 48 8b 04 25 c8 95 00 00 f6 80 47 e0 ff ff 04 74 04 <0f> 0b eb fe bf 00 00 01 04 e8 80 23 00 00 65 ff 04 25 c4 ec 00
RIP [<ffffffff813b64bd>] do_nmi+0x27/0x215
RSP <ffff88011fc06f28>
---[ end trace 0000000000000006 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 23657, comm: hackbench Tainted: G D W 3.0.1-rt9.7.el6rt.x86_64 #1
Call Trace:
<NMI> [<ffffffff813b32df>] panic+0x9b/0x1b2
[<ffffffff813b686b>] oops_end+0xd8/0xe8
[<ffffffff81004fdb>] die+0x5a/0x63
[<ffffffff813b62cc>] do_trap+0x121/0x130
[<ffffffff8100330b>] do_invalid_op+0xaa/0xb3
[<ffffffff813b64bd>] ? do_nmi+0x27/0x215
[<ffffffff813b6e34>] ? perf_event_nmi_handler+0x39/0x81
[<ffffffff813bc8db>] invalid_op+0x1b/0x20
[<ffffffff813b64bd>] ? do_nmi+0x27/0x215
[<ffffffff813b64ab>] ? do_nmi+0x15/0x215
[<ffffffff813b5fd0>] nmi+0x20/0x30
[<ffffffff813b33b4>] ? panic+0x170/0x1b2
<<EOE>>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
reply other threads:[~2011-08-12 19:39 UTC|newest]
Thread overview: [no followups] expand[flat|nested] mbox.gz Atom feed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110812143930.5cd2d5b8@redhat.com \
--to=williams@redhat.com \
--cc=acme@redhat.com \
--cc=linux-rt-users@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.