2.6.33: ftrace triggers soft lockup

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* 2.6.33: ftrace triggers soft lockup
@ 2010-03-03  6:04 Américo Wang
  2010-03-03 13:05 ` Steven Rostedt
                   ` (3 more replies)
  0 siblings, 4 replies; 37+ messages in thread
From: Américo Wang @ 2010-03-03  6:04 UTC (permalink / raw)
  To: LKML; +Cc: Steven Rostedt, Frederic Weisbecker, Ingo Molnar

[-- Attachment #1: Type: text/plain, Size: 1913 bytes --]

Hi, Steven, Frederic, Ingo,

I am not sure if this is ftrace's fault, but it is ftrace who triggers
the soft lockup. On my machine, it is pretty easy, just run:

echo function_graph > current_tracer

I found this when 2.6.33-rc7, but forgot to report here. I just tried today's
Linus tree, it still has this problem.

Attached are the full soft lockup warnings.

Below is my related configs:

CONFIG_STACKTRACE_SUPPORT=y
# CONFIG_RCU_TRACE is not set
# CONFIG_TREE_RCU_TRACE is not set
CONFIG_TRACEPOINTS=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
# CONFIG_NF_CONNTRACK is not set
# CONFIG_NETFILTER_XT_TARGET_TRACE is not set
CONFIG_MOUSE_PS2_TRACKPOINT=y
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_TRACE_IRQFLAGS=y
CONFIG_STACKTRACE=y
# CONFIG_BACKTRACE_SELF_TEST is not set
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_NOP_TRACER=y
CONFIG_HAVE_FTRACE_NMI_ENTER=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_FTRACE_NMI_ENTER=y
CONFIG_EVENT_TRACING=y
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_TRACING=y
CONFIG_GENERIC_TRACER=y
CONFIG_TRACING_SUPPORT=y
CONFIG_FTRACE=y
CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=y
# CONFIG_IRQSOFF_TRACER is not set
# CONFIG_SYSPROF_TRACER is not set
# CONFIG_SCHED_TRACER is not set
# CONFIG_FTRACE_SYSCALLS is not set
# CONFIG_BOOT_TRACER is not set
# CONFIG_KSYM_TRACER is not set
# CONFIG_STACK_TRACER is not set
# CONFIG_KMEMTRACE is not set
# CONFIG_WORKQUEUE_TRACER is not set
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_FTRACE_MCOUNT_RECORD=y
# CONFIG_FTRACE_STARTUP_TEST is not set
# CONFIG_MMIOTRACE is not set
CONFIG_SAMPLE_TRACEPOINTS=m
# CONFIG_SAMPLE_TRACE_EVENTS is not set
CONFIG_HAVE_MMIOTRACE_SUPPORT=y

Sorry that I don't have time to look into this. If you need other info,
just say.

Thanks!

[-- Attachment #2: ftrace-soft-lockup.txt --]
[-- Type: text/plain, Size: 12765 bytes --]

Mar  3 13:34:37 dhcp-66-70-5 kernel: BUG: soft lockup - CPU#3 stuck for 61s! [kstop/3:4876]
Mar  3 13:34:37 dhcp-66-70-5 kernel: Modules linked in: ipt_REJECT xt_tcpudp x_tables bridge stp deflate zlib_deflate ctr cast5 crypto_null ccm serpent blowfish twofish twofish_common ecb xcbc cbc md5 sha256_generic sha512_generic des_generic aes_x86_64 aes_generic ah6 ah4 esp6 esp4 xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm_ipcomp xfrm6_tunnel tunnel6 af_key autofs4 hidp rfcomm l2cap crc16 bluetooth lockd sunrpc ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i iw_cxgb3 ib_core cxgb3 mdio libiscsi_tcp libiscsi scsi_transport_iscsi cpufreq_ondemand powernow_k8 freq_table dm_multipath scsi_dh video output sbs sbshc power_meter hwmon battery acpi_memhotplug ac parport_pc lp parport sg dcdbas sr_mod cdrom serio_raw rtc_cmos rtc_core rtc_lib tpm_tis tpm tpm_bios button tg3 libphy shpchp i2c_nforce2 i2c_core pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod sata_nv libata sd_mod s
Mar  3 13:34:37 dhcp-66-70-5 kernel: si_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: ip_tables]
Mar  3 13:34:37 dhcp-66-70-5 kernel: irq event stamp: 51397
Mar  3 13:34:37 dhcp-66-70-5 kernel: hardirqs last  enabled at (51396): [<ffffffff8155287c>] restore_args+0x0/0x30
Mar  3 13:34:37 dhcp-66-70-5 kernel: hardirqs last disabled at (51397): [<ffffffff81003917>] save_args+0x67/0x70
Mar  3 13:34:37 dhcp-66-70-5 kernel: softirqs last  enabled at (6536): [<ffffffff81003881>] return_to_handler+0x0/0x2f
Mar  3 13:34:37 dhcp-66-70-5 kernel: softirqs last disabled at (6567): [<ffffffff81003881>] return_to_handler+0x0/0x2f
Mar  3 13:34:37 dhcp-66-70-5 kernel: CPU 3 
Mar  3 13:34:37 dhcp-66-70-5 kernel: Modules linked in: ipt_REJECT xt_tcpudp x_tables bridge stp deflate zlib_deflate ctr cast5 crypto_null ccm serpent blowfish twofish twofish_common ecb xcbc cbc md5 sha256_generic sha512_generic des_generic aes_x86_64 aes_generic ah6 ah4 esp6 esp4 xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm_ipcomp xfrm6_tunnel tunnel6 af_key autofs4 hidp rfcomm l2cap crc16 bluetooth lockd sunrpc ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i iw_cxgb3 ib_core cxgb3 mdio libiscsi_tcp libiscsi scsi_transport_iscsi cpufreq_ondemand powernow_k8 freq_table dm_multipath scsi_dh video output sbs sbshc power_meter hwmon battery acpi_memhotplug ac parport_pc lp parport sg dcdbas sr_mod cdrom serio_raw rtc_cmos rtc_core rtc_lib tpm_tis tpm tpm_bios button tg3 libphy shpchp i2c_nforce2 i2c_core pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod sata_nv libata sd_mod s
Mar  3 13:34:37 dhcp-66-70-5 kernel: si_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: ip_tables]
Mar  3 13:34:37 dhcp-66-70-5 kernel: 
Mar  3 13:34:38 dhcp-66-70-5 kernel: Pid: 4876, comm: kstop/3 Not tainted 2.6.33 #31 0J001K/PowerEdge T105 
Mar  3 13:34:38 dhcp-66-70-5 kernel: RIP: 0010:[<ffffffff8100d3d7>]  [<ffffffff8100d3d7>] native_sched_clock+0x2a/0x69
Mar  3 13:34:38 dhcp-66-70-5 kernel: RSP: 0018:ffff88002fe03da0  EFLAGS: 00000246
Mar  3 13:34:38 dhcp-66-70-5 kernel: RAX: 00000000b24e3e78 RBX: ffff88002fe03da0 RCX: ffff88002fe03ee0
Mar  3 13:34:39 dhcp-66-70-5 kernel: RDX: 0000000000000053 RSI: ffffffff810dfde6 RDI: ffffffff810e138d
Mar  3 13:34:41 dhcp-66-70-5 kernel: RBP: ffffffff81003881 R08: 0000000000000110 R09: ffffffff81003896
Mar  3 13:34:41 dhcp-66-70-5 kernel: R10: ffff88021697dfd8 R11: 0000000000000000 R12: ffff88002fe03d20
Mar  3 13:34:41 dhcp-66-70-5 kernel: R13: ffffffff810e138d R14: ffff8802287a6680 R15: ffffffff81003881
Mar  3 13:34:41 dhcp-66-70-5 kernel: FS:  00007fd9db23a720(0000) GS:ffff88002fe00000(0000) knlGS:0000000000000000
Mar  3 13:34:41 dhcp-66-70-5 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Mar  3 13:34:41 dhcp-66-70-5 kernel: CR2: 00000000008e42c0 CR3: 0000000219d90000 CR4: 00000000000006e0
Mar  3 13:34:41 dhcp-66-70-5 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar  3 13:34:41 dhcp-66-70-5 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Mar  3 13:34:41 dhcp-66-70-5 kernel: Process kstop/3 (pid: 4876, threadinfo ffff88021697c000, task ffff8802287a6680)
Mar  3 13:34:41 dhcp-66-70-5 kernel: Stack:
Mar  3 13:34:41 dhcp-66-70-5 kernel:  ffff88002fe03db0 ffffffff810e7407 ffff88002fe03df0 ffffffff810ff331
Mar  3 13:34:41 dhcp-66-70-5 kernel: <0> ffff88002ffceec0 ffff88002fe03e98 ffffffff810e138d ffff8802287a6680
Mar  3 13:34:41 dhcp-66-70-5 kernel: <0> ffff88002fe03e00 ffffffff810dfde6 ffff88002fe03e40 ffffffff8102cd97
Mar  3 13:34:41 dhcp-66-70-5 kernel: Call Trace:
Mar  3 13:34:41 dhcp-66-70-5 kernel:  <IRQ> 
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810e7407>] ? trace_clock_local+0x2c/0x43
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810ff331>] ? ftrace_push_return_trace+0xac/0x17a
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810e138d>] ? __rcu_process_callbacks+0x1d4/0x4a9
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810dfde6>] ? cpu_needs_another_gp+0x4/0x46
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8102cd97>] ? prepare_ftrace_return+0xda/0x138
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8100385a>] ? ftrace_graph_caller+0x4a/0x71
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003896>] ? return_to_handler+0x15/0x2f
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810dfdeb>] ? cpu_needs_another_gp+0x9/0x46
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81008d3c>] print_context_stack+0xb6/0x144
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81007375>] dump_trace+0x320/0x3bf
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81008979>] show_trace_log_lvl+0x5f/0x72
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8108568d>] ? worker_thread+0x0/0x46c
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8107de2a>] ? exit_signals+0x7/0x253
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8100703f>] show_stack_log_lvl+0x185/0x19b
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81006d50>] ? do_softirq+0x8c/0x181
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81007516>] show_registers+0x102/0x3ac
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8100e739>] show_regs+0x1d/0x53
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810da649>] softlockup_tick+0x268/0x2d2
Mar  3 13:34:41 dhcp-66-70-5 kernel:  <EOI> 
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8107de2a>] ? exit_signals+0x7/0x253
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8102cce7>] ? prepare_ftrace_return+0x2a/0x138
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81066035>] ? do_exit+0x160/0xad0
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8108568d>] ? worker_thread+0x0/0x46c
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8108568d>] ? worker_thread+0x0/0x46c
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8100385a>] ? ftrace_graph_caller+0x4a/0x71
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8107de2f>] ? exit_signals+0xc/0x253
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81066044>] ? do_exit+0x16f/0xad0
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8108ab21>] ? autoremove_wake_function+0x0/0x52
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8108568d>] ? worker_thread+0x0/0x46c
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81076327>] run_local_timers+0x32/0x3b
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81076547>] update_process_times+0x51/0xa5
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810048d4>] ? kernel_thread_helper+0x4/0x10
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8155287c>] ? restore_args+0x0/0x30
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8108a4c9>] ? kthread+0x0/0xec
Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810048d0>] ? kernel_thread_helper+0x0/0x10
Mar  3 13:34:41 dhcp-66-70-5 kernel: Code: c3 83 3d b0 80 af 00 00 55 48 89 e5 74 1b 48 69 15 bb d4 af 00 40 42 0f 00 48 b8 00 b8 64 d9 05 be f0 ff 48 8d 34 02 eb 3c 0f 31 <89> c7 89 d6 48 c7 c2 d0 b4 00 00 65 8b 04 25 88 d2 00 00 48 98 
Mar  3 13:34:42 dhcp-66-70-5 kernel: Call Trace:
Mar  3 13:34:42 dhcp-66-70-5 kernel:  <IRQ>  [<ffffffff810e7407>] ? trace_clock_local+0x2c/0x43
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff810ff331>] ? ftrace_push_return_trace+0xac/0x17a
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff810e138d>] ? __rcu_process_callbacks+0x1d4/0x4a9
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff810dfde6>] ? cpu_needs_another_gp+0x4/0x46
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff8102cd97>] ? prepare_ftrace_return+0xda/0x138
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff8100385a>] ? ftrace_graph_caller+0x4a/0x71
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff81003896>] ? return_to_handler+0x15/0x2f
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff810dfdeb>] ? cpu_needs_another_gp+0x9/0x46
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff81008d3c>] print_context_stack+0xb6/0x144
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff81007375>] dump_trace+0x320/0x3bf
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff81008979>] show_trace_log_lvl+0x5f/0x72
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff8108568d>] ? worker_thread+0x0/0x46c
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff8107de2a>] ? exit_signals+0x7/0x253
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff810089a8>] show_trace+0x1c/0x25
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff81006d50>] ? do_softirq+0x8c/0x181
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff8100e763>] show_regs+0x47/0x53
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff810da649>] softlockup_tick+0x268/0x2d2
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff81076327>] run_local_timers+0x32/0x3b
Mar  3 13:34:42 dhcp-66-70-5 kernel:  <EOI>  [<ffffffff8107de2a>] ? exit_signals+0x7/0x253
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff8102cce7>] ? prepare_ftrace_return+0x2a/0x138
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff81066035>] ? do_exit+0x160/0xad0
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff8108568d>] ? worker_thread+0x0/0x46c
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff8108568d>] ? worker_thread+0x0/0x46c
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff8100385a>] ? ftrace_graph_caller+0x4a/0x71
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff8107de2f>] ? exit_signals+0xc/0x253
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff81066044>] ? do_exit+0x16f/0xad0
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff8108ab21>] ? autoremove_wake_function+0x0/0x52
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff8108568d>] ? worker_thread+0x0/0x46c
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff81076547>] update_process_times+0x51/0xa5
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff8109c974>] tick_periodic+0x99/0xb3
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff810048d4>] ? kernel_thread_helper+0x4/0x10
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff8155287c>] ? restore_args+0x0/0x30
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff8108a4c9>] ? kthread+0x0/0xec
Mar  3 13:34:42 dhcp-66-70-5 kernel:  [<ffffffff810048d0>] ? kernel_thread_helper+0x0/0x10


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-03  6:04 2.6.33: ftrace triggers soft lockup Américo Wang
@ 2010-03-03 13:05 ` Steven Rostedt
  2010-03-04  2:58   ` Américo Wang
  2010-03-04  1:33 ` Li Zefan
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 37+ messages in thread
From: Steven Rostedt @ 2010-03-03 13:05 UTC (permalink / raw)
  To: Américo Wang; +Cc: LKML, Frederic Weisbecker, Ingo Molnar

On Wed, 2010-03-03 at 14:04 +0800, Américo Wang wrote:
> Hi, Steven, Frederic, Ingo,
> 
> I am not sure if this is ftrace's fault, but it is ftrace who triggers
> the soft lockup. On my machine, it is pretty easy, just run:
> 
> echo function_graph > current_tracer
> 
> I found this when 2.6.33-rc7, but forgot to report here. I just tried today's
> Linus tree, it still has this problem.
> 
> Attached are the full soft lockup warnings.
> 
> Below is my related configs:

Actually this is not enough for me. Could you send me the full config,
as well as the full dmesg, not just the dump.

Thanks!

-- Steve

> 
> CONFIG_STACKTRACE_SUPPORT=y
> # CONFIG_RCU_TRACE is not set
> # CONFIG_TREE_RCU_TRACE is not set
> CONFIG_TRACEPOINTS=y
> CONFIG_HAVE_ARCH_TRACEHOOK=y
> # CONFIG_NF_CONNTRACK is not set
> # CONFIG_NETFILTER_XT_TARGET_TRACE is not set
> CONFIG_MOUSE_PS2_TRACKPOINT=y
> CONFIG_TRACE_IRQFLAGS_SUPPORT=y
> CONFIG_TRACE_IRQFLAGS=y
> CONFIG_STACKTRACE=y
> # CONFIG_BACKTRACE_SELF_TEST is not set
> CONFIG_USER_STACKTRACE_SUPPORT=y
> CONFIG_NOP_TRACER=y
> CONFIG_HAVE_FTRACE_NMI_ENTER=y
> CONFIG_HAVE_FUNCTION_TRACER=y
> CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
> CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST=y
> CONFIG_HAVE_DYNAMIC_FTRACE=y
> CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
> CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
> CONFIG_FTRACE_NMI_ENTER=y
> CONFIG_EVENT_TRACING=y
> CONFIG_CONTEXT_SWITCH_TRACER=y
> CONFIG_TRACING=y
> CONFIG_GENERIC_TRACER=y
> CONFIG_TRACING_SUPPORT=y
> CONFIG_FTRACE=y
> CONFIG_FUNCTION_TRACER=y
> CONFIG_FUNCTION_GRAPH_TRACER=y
> # CONFIG_IRQSOFF_TRACER is not set
> # CONFIG_SYSPROF_TRACER is not set
> # CONFIG_SCHED_TRACER is not set
> # CONFIG_FTRACE_SYSCALLS is not set
> # CONFIG_BOOT_TRACER is not set
> # CONFIG_KSYM_TRACER is not set
> # CONFIG_STACK_TRACER is not set
> # CONFIG_KMEMTRACE is not set
> # CONFIG_WORKQUEUE_TRACER is not set
> CONFIG_BLK_DEV_IO_TRACE=y
> CONFIG_DYNAMIC_FTRACE=y
> CONFIG_FTRACE_MCOUNT_RECORD=y
> # CONFIG_FTRACE_STARTUP_TEST is not set
> # CONFIG_MMIOTRACE is not set
> CONFIG_SAMPLE_TRACEPOINTS=m
> # CONFIG_SAMPLE_TRACE_EVENTS is not set
> CONFIG_HAVE_MMIOTRACE_SUPPORT=y
> 
> Sorry that I don't have time to look into this. If you need other info,
> just say.
> 
> Thanks!



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-03 13:05 ` Steven Rostedt
@ 2010-03-04  2:58   ` Américo Wang
  0 siblings, 0 replies; 37+ messages in thread
From: Américo Wang @ 2010-03-04  2:58 UTC (permalink / raw)
  To: rostedt; +Cc: LKML, Frederic Weisbecker, Ingo Molnar

On Wed, Mar 3, 2010 at 9:05 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Wed, 2010-03-03 at 14:04 +0800, Américo Wang wrote:
>> Hi, Steven, Frederic, Ingo,
>>
>> I am not sure if this is ftrace's fault, but it is ftrace who triggers
>> the soft lockup. On my machine, it is pretty easy, just run:
>>
>> echo function_graph > current_tracer
>>
>> I found this when 2.6.33-rc7, but forgot to report here. I just tried today's
>> Linus tree, it still has this problem.
>>
>> Attached are the full soft lockup warnings.
>>
>> Below is my related configs:
>
> Actually this is not enough for me. Could you send me the full config,
> as well as the full dmesg, not just the dump.
>

Full dmesg: http://pastebin.ca/1822197
Full config: http://pastebin.ca/1822201

Note, I changed some tracing unrelated config after getting that soft lockup,
so it would not matter.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-03  6:04 2.6.33: ftrace triggers soft lockup Américo Wang
  2010-03-03 13:05 ` Steven Rostedt
@ 2010-03-04  1:33 ` Li Zefan
  2010-03-04  1:46 ` Frederic Weisbecker
  2010-03-04 13:54 ` Steven Rostedt
  3 siblings, 0 replies; 37+ messages in thread
From: Li Zefan @ 2010-03-04  1:33 UTC (permalink / raw)
  To: Américo Wang; +Cc: LKML, Steven Rostedt, Frederic Weisbecker, Ingo Molnar

Américo Wang wrote:
> Hi, Steven, Frederic, Ingo,
> 
> I am not sure if this is ftrace's fault, but it is ftrace who triggers
> the soft lockup. On my machine, it is pretty easy, just run:
> 
> echo function_graph > current_tracer
> 
> I found this when 2.6.33-rc7, but forgot to report here. I just tried today's
> Linus tree, it still has this problem.
> 

I can also trigger soft lockup while running a stress test for ftrace.

I'll post detailed information after more testing.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-03  6:04 2.6.33: ftrace triggers soft lockup Américo Wang
  2010-03-03 13:05 ` Steven Rostedt
  2010-03-04  1:33 ` Li Zefan
@ 2010-03-04  1:46 ` Frederic Weisbecker
  2010-03-04  3:01   ` Américo Wang
  2010-03-04 13:54 ` Steven Rostedt
  3 siblings, 1 reply; 37+ messages in thread
From: Frederic Weisbecker @ 2010-03-04  1:46 UTC (permalink / raw)
  To: Américo Wang; +Cc: LKML, Steven Rostedt, Ingo Molnar

On Wed, Mar 03, 2010 at 02:04:23PM +0800, Américo Wang wrote:
> Mar  3 13:34:37 dhcp-66-70-5 kernel: BUG: soft lockup - CPU#3 stuck for 61s! [kstop/3:4876]
> Mar  3 13:34:37 dhcp-66-70-5 kernel: Modules linked in: ipt_REJECT xt_tcpudp x_tables bridge stp deflate zlib_deflate ctr cast5 crypto_null ccm serpent blowfish twofish twofish_common ecb xcbc cbc md5 sha256_generic sha512_generic des_generic aes_x86_64 aes_generic ah6 ah4 esp6 esp4 xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm_ipcomp xfrm6_tunnel tunnel6 af_key autofs4 hidp rfcomm l2cap crc16 bluetooth lockd sunrpc ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i iw_cxgb3 ib_core cxgb3 mdio libiscsi_tcp libiscsi scsi_transport_iscsi cpufreq_ondemand powernow_k8 freq_table dm_multipath scsi_dh video output sbs sbshc power_meter hwmon battery acpi_memhotplug ac parport_pc lp parport sg dcdbas sr_mod cdrom serio_raw rtc_cmos rtc_core rtc_lib tpm_tis tpm tpm_bios button tg3 libphy shpchp i2c_nforce2 i2c_core pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod sata_nv libata sd_mod s
> Mar  3 13:34:37 dhcp-66-70-5 kernel: si_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: ip_tables]
> Mar  3 13:34:37 dhcp-66-70-5 kernel: irq event stamp: 51397
> Mar  3 13:34:37 dhcp-66-70-5 kernel: hardirqs last  enabled at (51396): [<ffffffff8155287c>] restore_args+0x0/0x30
> Mar  3 13:34:37 dhcp-66-70-5 kernel: hardirqs last disabled at (51397): [<ffffffff81003917>] save_args+0x67/0x70
> Mar  3 13:34:37 dhcp-66-70-5 kernel: softirqs last  enabled at (6536): [<ffffffff81003881>] return_to_handler+0x0/0x2f
> Mar  3 13:34:37 dhcp-66-70-5 kernel: softirqs last disabled at (6567): [<ffffffff81003881>] return_to_handler+0x0/0x2f
> Mar  3 13:34:37 dhcp-66-70-5 kernel: CPU 3 
> Mar  3 13:34:37 dhcp-66-70-5 kernel: Modules linked in: ipt_REJECT xt_tcpudp x_tables bridge stp deflate zlib_deflate ctr cast5 crypto_null ccm serpent blowfish twofish twofish_common ecb xcbc cbc md5 sha256_generic sha512_generic des_generic aes_x86_64 aes_generic ah6 ah4 esp6 esp4 xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm_ipcomp xfrm6_tunnel tunnel6 af_key autofs4 hidp rfcomm l2cap crc16 bluetooth lockd sunrpc ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i iw_cxgb3 ib_core cxgb3 mdio libiscsi_tcp libiscsi scsi_transport_iscsi cpufreq_ondemand powernow_k8 freq_table dm_multipath scsi_dh video output sbs sbshc power_meter hwmon battery acpi_memhotplug ac parport_pc lp parport sg dcdbas sr_mod cdrom serio_raw rtc_cmos rtc_core rtc_lib tpm_tis tpm tpm_bios button tg3 libphy shpchp i2c_nforce2 i2c_core pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod sata_nv libata sd_mod s
> Mar  3 13:34:37 dhcp-66-70-5 kernel: si_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: ip_tables]
> Mar  3 13:34:37 dhcp-66-70-5 kernel: 
> Mar  3 13:34:38 dhcp-66-70-5 kernel: Pid: 4876, comm: kstop/3 Not tainted 2.6.33 #31 0J001K/PowerEdge T105 
> Mar  3 13:34:38 dhcp-66-70-5 kernel: RIP: 0010:[<ffffffff8100d3d7>]  [<ffffffff8100d3d7>] native_sched_clock+0x2a/0x69
> Mar  3 13:34:38 dhcp-66-70-5 kernel: RSP: 0018:ffff88002fe03da0  EFLAGS: 00000246
> Mar  3 13:34:38 dhcp-66-70-5 kernel: RAX: 00000000b24e3e78 RBX: ffff88002fe03da0 RCX: ffff88002fe03ee0
> Mar  3 13:34:39 dhcp-66-70-5 kernel: RDX: 0000000000000053 RSI: ffffffff810dfde6 RDI: ffffffff810e138d
> Mar  3 13:34:41 dhcp-66-70-5 kernel: RBP: ffffffff81003881 R08: 0000000000000110 R09: ffffffff81003896
> Mar  3 13:34:41 dhcp-66-70-5 kernel: R10: ffff88021697dfd8 R11: 0000000000000000 R12: ffff88002fe03d20
> Mar  3 13:34:41 dhcp-66-70-5 kernel: R13: ffffffff810e138d R14: ffff8802287a6680 R15: ffffffff81003881
> Mar  3 13:34:41 dhcp-66-70-5 kernel: FS:  00007fd9db23a720(0000) GS:ffff88002fe00000(0000) knlGS:0000000000000000
> Mar  3 13:34:41 dhcp-66-70-5 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> Mar  3 13:34:41 dhcp-66-70-5 kernel: CR2: 00000000008e42c0 CR3: 0000000219d90000 CR4: 00000000000006e0
> Mar  3 13:34:41 dhcp-66-70-5 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Mar  3 13:34:41 dhcp-66-70-5 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Mar  3 13:34:41 dhcp-66-70-5 kernel: Process kstop/3 (pid: 4876, threadinfo ffff88021697c000, task ffff8802287a6680)
> Mar  3 13:34:41 dhcp-66-70-5 kernel: Stack:
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  ffff88002fe03db0 ffffffff810e7407 ffff88002fe03df0 ffffffff810ff331
> Mar  3 13:34:41 dhcp-66-70-5 kernel: <0> ffff88002ffceec0 ffff88002fe03e98 ffffffff810e138d ffff8802287a6680
> Mar  3 13:34:41 dhcp-66-70-5 kernel: <0> ffff88002fe03e00 ffffffff810dfde6 ffff88002fe03e40 ffffffff8102cd97
> Mar  3 13:34:41 dhcp-66-70-5 kernel: Call Trace:
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  <IRQ> 
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810e7407>] ? trace_clock_local+0x2c/0x43
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810ff331>] ? ftrace_push_return_trace+0xac/0x17a
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810e138d>] ? __rcu_process_callbacks+0x1d4/0x4a9
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810dfde6>] ? cpu_needs_another_gp+0x4/0x46
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8102cd97>] ? prepare_ftrace_return+0xda/0x138
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8100385a>] ? ftrace_graph_caller+0x4a/0x71
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003896>] ? return_to_handler+0x15/0x2f
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810dfdeb>] ? cpu_needs_another_gp+0x9/0x46
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81008d3c>] print_context_stack+0xb6/0x144
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81007375>] dump_trace+0x320/0x3bf
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81008979>] show_trace_log_lvl+0x5f/0x72
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8108568d>] ? worker_thread+0x0/0x46c
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8107de2a>] ? exit_signals+0x7/0x253
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8100703f>] show_stack_log_lvl+0x185/0x19b
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81006d50>] ? do_softirq+0x8c/0x181
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81007516>] show_registers+0x102/0x3ac
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8100e739>] show_regs+0x1d/0x53
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810da649>] softlockup_tick+0x268/0x2d2
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  <EOI> 
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8107de2a>] ? exit_signals+0x7/0x253
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8102cce7>] ? prepare_ftrace_return+0x2a/0x138
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81066035>] ? do_exit+0x160/0xad0
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8108568d>] ? worker_thread+0x0/0x46c
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8108568d>] ? worker_thread+0x0/0x46c
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8100385a>] ? ftrace_graph_caller+0x4a/0x71
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8107de2f>] ? exit_signals+0xc/0x253
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81066044>] ? do_exit+0x16f/0xad0
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8108ab21>] ? autoremove_wake_function+0x0/0x52
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8108568d>] ? worker_thread+0x0/0x46c
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81076327>] run_local_timers+0x32/0x3b
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81076547>] update_process_times+0x51/0xa5
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810048d4>] ? kernel_thread_helper+0x4/0x10
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8155287c>] ? restore_args+0x0/0x30
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8108a4c9>] ? kthread+0x0/0xec
> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810048d0>] ? kernel_thread_helper+0x0/0x10


So it is stuck in stop machine. I wonder where exactly. I see some do_exit
at the top but I wonder how much they are reliable.

Anyway, as Steve said, we really need a full config to reproduce it.

Thanks.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-04  1:46 ` Frederic Weisbecker
@ 2010-03-04  3:01   ` Américo Wang
  2010-03-04  3:18     ` Steven Rostedt
  0 siblings, 1 reply; 37+ messages in thread
From: Américo Wang @ 2010-03-04  3:01 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: LKML, Steven Rostedt, Ingo Molnar

On Thu, Mar 4, 2010 at 9:46 AM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> On Wed, Mar 03, 2010 at 02:04:23PM +0800, Américo Wang wrote:
>> Mar  3 13:34:37 dhcp-66-70-5 kernel: BUG: soft lockup - CPU#3 stuck for 61s! [kstop/3:4876]
>> Mar  3 13:34:37 dhcp-66-70-5 kernel: Modules linked in: ipt_REJECT xt_tcpudp x_tables bridge stp deflate zlib_deflate ctr cast5 crypto_null ccm serpent blowfish twofish twofish_common ecb xcbc cbc md5 sha256_generic sha512_generic des_generic aes_x86_64 aes_generic ah6 ah4 esp6 esp4 xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm_ipcomp xfrm6_tunnel tunnel6 af_key autofs4 hidp rfcomm l2cap crc16 bluetooth lockd sunrpc ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i iw_cxgb3 ib_core cxgb3 mdio libiscsi_tcp libiscsi scsi_transport_iscsi cpufreq_ondemand powernow_k8 freq_table dm_multipath scsi_dh video output sbs sbshc power_meter hwmon battery acpi_memhotplug ac parport_pc lp parport sg dcdbas sr_mod cdrom serio_raw rtc_cmos rtc_core rtc_lib tpm_tis tpm tpm_bios button tg3 libphy shpchp i2c_nforce2 i2c_core pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod sata_nv libata sd_mod s
>> Mar  3 13:34:37 dhcp-66-70-5 kernel: si_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: ip_tables]
>> Mar  3 13:34:37 dhcp-66-70-5 kernel: irq event stamp: 51397
>> Mar  3 13:34:37 dhcp-66-70-5 kernel: hardirqs last  enabled at (51396): [<ffffffff8155287c>] restore_args+0x0/0x30
>> Mar  3 13:34:37 dhcp-66-70-5 kernel: hardirqs last disabled at (51397): [<ffffffff81003917>] save_args+0x67/0x70
>> Mar  3 13:34:37 dhcp-66-70-5 kernel: softirqs last  enabled at (6536): [<ffffffff81003881>] return_to_handler+0x0/0x2f
>> Mar  3 13:34:37 dhcp-66-70-5 kernel: softirqs last disabled at (6567): [<ffffffff81003881>] return_to_handler+0x0/0x2f
>> Mar  3 13:34:37 dhcp-66-70-5 kernel: CPU 3
>> Mar  3 13:34:37 dhcp-66-70-5 kernel: Modules linked in: ipt_REJECT xt_tcpudp x_tables bridge stp deflate zlib_deflate ctr cast5 crypto_null ccm serpent blowfish twofish twofish_common ecb xcbc cbc md5 sha256_generic sha512_generic des_generic aes_x86_64 aes_generic ah6 ah4 esp6 esp4 xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm_ipcomp xfrm6_tunnel tunnel6 af_key autofs4 hidp rfcomm l2cap crc16 bluetooth lockd sunrpc ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_addr iscsi_tcp bnx2i cnic uio ipv6 cxgb3i iw_cxgb3 ib_core cxgb3 mdio libiscsi_tcp libiscsi scsi_transport_iscsi cpufreq_ondemand powernow_k8 freq_table dm_multipath scsi_dh video output sbs sbshc power_meter hwmon battery acpi_memhotplug ac parport_pc lp parport sg dcdbas sr_mod cdrom serio_raw rtc_cmos rtc_core rtc_lib tpm_tis tpm tpm_bios button tg3 libphy shpchp i2c_nforce2 i2c_core pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod sata_nv libata sd_mod s
>> Mar  3 13:34:37 dhcp-66-70-5 kernel: si_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: ip_tables]
>> Mar  3 13:34:37 dhcp-66-70-5 kernel:
>> Mar  3 13:34:38 dhcp-66-70-5 kernel: Pid: 4876, comm: kstop/3 Not tainted 2.6.33 #31 0J001K/PowerEdge T105
>> Mar  3 13:34:38 dhcp-66-70-5 kernel: RIP: 0010:[<ffffffff8100d3d7>]  [<ffffffff8100d3d7>] native_sched_clock+0x2a/0x69
>> Mar  3 13:34:38 dhcp-66-70-5 kernel: RSP: 0018:ffff88002fe03da0  EFLAGS: 00000246
>> Mar  3 13:34:38 dhcp-66-70-5 kernel: RAX: 00000000b24e3e78 RBX: ffff88002fe03da0 RCX: ffff88002fe03ee0
>> Mar  3 13:34:39 dhcp-66-70-5 kernel: RDX: 0000000000000053 RSI: ffffffff810dfde6 RDI: ffffffff810e138d
>> Mar  3 13:34:41 dhcp-66-70-5 kernel: RBP: ffffffff81003881 R08: 0000000000000110 R09: ffffffff81003896
>> Mar  3 13:34:41 dhcp-66-70-5 kernel: R10: ffff88021697dfd8 R11: 0000000000000000 R12: ffff88002fe03d20
>> Mar  3 13:34:41 dhcp-66-70-5 kernel: R13: ffffffff810e138d R14: ffff8802287a6680 R15: ffffffff81003881
>> Mar  3 13:34:41 dhcp-66-70-5 kernel: FS:  00007fd9db23a720(0000) GS:ffff88002fe00000(0000) knlGS:0000000000000000
>> Mar  3 13:34:41 dhcp-66-70-5 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> Mar  3 13:34:41 dhcp-66-70-5 kernel: CR2: 00000000008e42c0 CR3: 0000000219d90000 CR4: 00000000000006e0
>> Mar  3 13:34:41 dhcp-66-70-5 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> Mar  3 13:34:41 dhcp-66-70-5 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Mar  3 13:34:41 dhcp-66-70-5 kernel: Process kstop/3 (pid: 4876, threadinfo ffff88021697c000, task ffff8802287a6680)
>> Mar  3 13:34:41 dhcp-66-70-5 kernel: Stack:
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  ffff88002fe03db0 ffffffff810e7407 ffff88002fe03df0 ffffffff810ff331
>> Mar  3 13:34:41 dhcp-66-70-5 kernel: <0> ffff88002ffceec0 ffff88002fe03e98 ffffffff810e138d ffff8802287a6680
>> Mar  3 13:34:41 dhcp-66-70-5 kernel: <0> ffff88002fe03e00 ffffffff810dfde6 ffff88002fe03e40 ffffffff8102cd97
>> Mar  3 13:34:41 dhcp-66-70-5 kernel: Call Trace:
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  <IRQ>
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810e7407>] ? trace_clock_local+0x2c/0x43
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810ff331>] ? ftrace_push_return_trace+0xac/0x17a
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810e138d>] ? __rcu_process_callbacks+0x1d4/0x4a9
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810dfde6>] ? cpu_needs_another_gp+0x4/0x46
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8102cd97>] ? prepare_ftrace_return+0xda/0x138
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8100385a>] ? ftrace_graph_caller+0x4a/0x71
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003896>] ? return_to_handler+0x15/0x2f
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810dfdeb>] ? cpu_needs_another_gp+0x9/0x46
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81008d3c>] print_context_stack+0xb6/0x144
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81007375>] dump_trace+0x320/0x3bf
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81008979>] show_trace_log_lvl+0x5f/0x72
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8108568d>] ? worker_thread+0x0/0x46c
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8107de2a>] ? exit_signals+0x7/0x253
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8100703f>] show_stack_log_lvl+0x185/0x19b
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81006d50>] ? do_softirq+0x8c/0x181
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81007516>] show_registers+0x102/0x3ac
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8100e739>] show_regs+0x1d/0x53
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810da649>] softlockup_tick+0x268/0x2d2
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  <EOI>
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8107de2a>] ? exit_signals+0x7/0x253
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8102cce7>] ? prepare_ftrace_return+0x2a/0x138
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81066035>] ? do_exit+0x160/0xad0
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8108568d>] ? worker_thread+0x0/0x46c
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8108568d>] ? worker_thread+0x0/0x46c
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8100385a>] ? ftrace_graph_caller+0x4a/0x71
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8107de2f>] ? exit_signals+0xc/0x253
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81066044>] ? do_exit+0x16f/0xad0
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8108ab21>] ? autoremove_wake_function+0x0/0x52
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8108568d>] ? worker_thread+0x0/0x46c
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81076327>] run_local_timers+0x32/0x3b
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81003881>] ? return_to_handler+0x0/0x2f
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff81076547>] update_process_times+0x51/0xa5
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810048d4>] ? kernel_thread_helper+0x4/0x10
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8155287c>] ? restore_args+0x0/0x30
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff8108a4c9>] ? kthread+0x0/0xec
>> Mar  3 13:34:41 dhcp-66-70-5 kernel:  [<ffffffff810048d0>] ? kernel_thread_helper+0x0/0x10
>
>
> So it is stuck in stop machine. I wonder where exactly. I see some do_exit
> at the top but I wonder how much they are reliable.

Well, I think 'kstop' is just random, sometimes I got 'watchdog' or some other
process.

>
> Anyway, as Steve said, we really need a full config to reproduce it.
>

Done in another reply.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-04  3:01   ` Américo Wang
@ 2010-03-04  3:18     ` Steven Rostedt
  2010-03-04  4:06       ` Frederic Weisbecker
  2010-03-04  5:10       ` Américo Wang
  0 siblings, 2 replies; 37+ messages in thread
From: Steven Rostedt @ 2010-03-04  3:18 UTC (permalink / raw)
  To: Américo Wang; +Cc: Frederic Weisbecker, LKML, Ingo Molnar, Peter Zijlstra

On Thu, 2010-03-04 at 11:01 +0800, Américo Wang wrote:

> >
> > So it is stuck in stop machine. I wonder where exactly. I see some do_exit
> > at the top but I wonder how much they are reliable.
> 
> Well, I think 'kstop' is just random, sometimes I got 'watchdog' or some other
> process.
> 
> >
> > Anyway, as Steve said, we really need a full config to reproduce it.
> >
> 
> Done in another reply.

Thanks!

Frederic, I notice that lockdep is on, did anything change that might
slow down the code in lockdep, or is the function graph tracer doing
more locking?

I'm betting that we are hitting a live lock. That is, an interrupt goes
off, it is being traced, and the function graph is tracing it, but some
locking is happening (although it also tracks disabling of interrupts)
and this slows the interrupt handler down enough that when it finishes,
another interrupt goes off.

Américo,

Could you disable LOCKDEP and see if you still encounter this lockup?

Thanks,

-- Steve

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-04  3:18     ` Steven Rostedt
@ 2010-03-04  4:06       ` Frederic Weisbecker
  2010-03-04  5:10       ` Américo Wang
  1 sibling, 0 replies; 37+ messages in thread
From: Frederic Weisbecker @ 2010-03-04  4:06 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Américo Wang, LKML, Ingo Molnar, Peter Zijlstra

On Wed, Mar 03, 2010 at 10:18:26PM -0500, Steven Rostedt wrote:
> On Thu, 2010-03-04 at 11:01 +0800, Américo Wang wrote:
> 
> > >
> > > So it is stuck in stop machine. I wonder where exactly. I see some do_exit
> > > at the top but I wonder how much they are reliable.
> > 
> > Well, I think 'kstop' is just random, sometimes I got 'watchdog' or some other
> > process.
> > 
> > >
> > > Anyway, as Steve said, we really need a full config to reproduce it.
> > >
> > 
> > Done in another reply.
> 
> Thanks!
> 
> Frederic, I notice that lockdep is on, did anything change that might
> slow down the code in lockdep, or is the function graph tracer doing
> more locking?


No,
there doesn't seem to be intrusive changes in lockdep between .32 and .33
And the function graph tracer is still fully lockless (and hopefully
it will be forever).


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-04  3:18     ` Steven Rostedt
  2010-03-04  4:06       ` Frederic Weisbecker
@ 2010-03-04  5:10       ` Américo Wang
  2010-03-04  5:18         ` Frederic Weisbecker
  1 sibling, 1 reply; 37+ messages in thread
From: Américo Wang @ 2010-03-04  5:10 UTC (permalink / raw)
  To: rostedt; +Cc: Frederic Weisbecker, LKML, Ingo Molnar, Peter Zijlstra

On Thu, Mar 4, 2010 at 11:18 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Thu, 2010-03-04 at 11:01 +0800, Américo Wang wrote:
>
>> >
>> > So it is stuck in stop machine. I wonder where exactly. I see some do_exit
>> > at the top but I wonder how much they are reliable.
>>
>> Well, I think 'kstop' is just random, sometimes I got 'watchdog' or some other
>> process.
>>
>> >
>> > Anyway, as Steve said, we really need a full config to reproduce it.
>> >
>>
>> Done in another reply.
>
> Thanks!
>
> Frederic, I notice that lockdep is on, did anything change that might
> slow down the code in lockdep, or is the function graph tracer doing
> more locking?
>
> I'm betting that we are hitting a live lock. That is, an interrupt goes
> off, it is being traced, and the function graph is tracing it, but some
> locking is happening (although it also tracks disabling of interrupts)
> and this slows the interrupt handler down enough that when it finishes,
> another interrupt goes off.
>
> Américo,
>
> Could you disable LOCKDEP and see if you still encounter this lockup?
>

Sure, after disabling LOCKDEP, I can't see the warning, _but_ the system
is still as unacceptablly slow as when LOCKDEP was enabled.

So I am afraid the problem is still there.

Thanks.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-04  5:10       ` Américo Wang
@ 2010-03-04  5:18         ` Frederic Weisbecker
  0 siblings, 0 replies; 37+ messages in thread
From: Frederic Weisbecker @ 2010-03-04  5:18 UTC (permalink / raw)
  To: Américo Wang; +Cc: rostedt, LKML, Ingo Molnar, Peter Zijlstra

On Thu, Mar 04, 2010 at 01:10:16PM +0800, Américo Wang wrote:
> On Thu, Mar 4, 2010 at 11:18 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> > On Thu, 2010-03-04 at 11:01 +0800, Américo Wang wrote:
> >
> >> >
> >> > So it is stuck in stop machine. I wonder where exactly. I see some do_exit
> >> > at the top but I wonder how much they are reliable.
> >>
> >> Well, I think 'kstop' is just random, sometimes I got 'watchdog' or some other
> >> process.
> >>
> >> >
> >> > Anyway, as Steve said, we really need a full config to reproduce it.
> >> >
> >>
> >> Done in another reply.
> >
> > Thanks!
> >
> > Frederic, I notice that lockdep is on, did anything change that might
> > slow down the code in lockdep, or is the function graph tracer doing
> > more locking?
> >
> > I'm betting that we are hitting a live lock. That is, an interrupt goes
> > off, it is being traced, and the function graph is tracing it, but some
> > locking is happening (although it also tracks disabling of interrupts)
> > and this slows the interrupt handler down enough that when it finishes,
> > another interrupt goes off.
> >
> > Américo,
> >
> > Could you disable LOCKDEP and see if you still encounter this lockup?
> >
> 
> Sure, after disabling LOCKDEP, I can't see the warning, _but_ the system
> is still as unacceptablly slow as when LOCKDEP was enabled.



Looks like a progress. It doesn't appear to be a true lockup but more a
starvation or a livelock.

I'm building your config, hopefully I could reproduce.

Thanks.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-03  6:04 2.6.33: ftrace triggers soft lockup Américo Wang
                   ` (2 preceding siblings ...)
  2010-03-04  1:46 ` Frederic Weisbecker
@ 2010-03-04 13:54 ` Steven Rostedt
  2010-03-05  4:14   ` Américo Wang
  3 siblings, 1 reply; 37+ messages in thread
From: Steven Rostedt @ 2010-03-04 13:54 UTC (permalink / raw)
  To: Américo Wang; +Cc: LKML, Frederic Weisbecker, Ingo Molnar

On Wed, 2010-03-03 at 14:04 +0800, Américo Wang wrote:
> I am not sure if this is ftrace's fault, but it is ftrace who triggers
> the soft lockup. On my machine, it is pretty easy, just run:
> 
> echo function_graph > current_tracer
> 
> I found this when 2.6.33-rc7, but forgot to report here. I just tried
> today's
> Linus tree, it still has this problem.

So this is not present in 2.6.33-rc6? If you know approximately when it
appeared, would you be able to do a git bisect on it?

Thanks,

-- Steve


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-04 13:54 ` Steven Rostedt
@ 2010-03-05  4:14   ` Américo Wang
  2010-03-05  7:16     ` Américo Wang
  0 siblings, 1 reply; 37+ messages in thread
From: Américo Wang @ 2010-03-05  4:14 UTC (permalink / raw)
  To: rostedt; +Cc: LKML, Frederic Weisbecker, Ingo Molnar

On Thu, Mar 4, 2010 at 9:54 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Wed, 2010-03-03 at 14:04 +0800, Américo Wang wrote:
>> I am not sure if this is ftrace's fault, but it is ftrace who triggers
>> the soft lockup. On my machine, it is pretty easy, just run:
>>
>> echo function_graph > current_tracer
>>
>> I found this when 2.6.33-rc7, but forgot to report here. I just tried
>> today's
>> Linus tree, it still has this problem.
>
> So this is not present in 2.6.33-rc6? If you know approximately when it
> appeared, would you be able to do a git bisect on it?
>

I can't say that because I didn't try -rc6.

Ok, let me try to do a bisect, but since weekend is coming, I can't do
any testing during weekend, so will try it next Monday.

Thanks!

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-05  4:14   ` Américo Wang
@ 2010-03-05  7:16     ` Américo Wang
  2010-03-05 15:06       ` Steven Rostedt
  0 siblings, 1 reply; 37+ messages in thread
From: Américo Wang @ 2010-03-05  7:16 UTC (permalink / raw)
  To: rostedt; +Cc: LKML, Frederic Weisbecker, Ingo Molnar

On Fri, Mar 5, 2010 at 12:14 PM, Américo Wang <xiyou.wangcong@gmail.com> wrote:
> On Thu, Mar 4, 2010 at 9:54 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
>> On Wed, 2010-03-03 at 14:04 +0800, Américo Wang wrote:
>>> I am not sure if this is ftrace's fault, but it is ftrace who triggers
>>> the soft lockup. On my machine, it is pretty easy, just run:
>>>
>>> echo function_graph > current_tracer
>>>
>>> I found this when 2.6.33-rc7, but forgot to report here. I just tried
>>> today's
>>> Linus tree, it still has this problem.
>>
>> So this is not present in 2.6.33-rc6? If you know approximately when it
>> appeared, would you be able to do a git bisect on it?
>>
>
> I can't say that because I didn't try -rc6.
>

Sigh, 2.6.33-rc6 doesn't work, even 2.6.32 doesn't work...

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-05  7:16     ` Américo Wang
@ 2010-03-05 15:06       ` Steven Rostedt
  2010-03-05 17:16         ` Frederic Weisbecker
  2010-03-08  2:31         ` Américo Wang
  0 siblings, 2 replies; 37+ messages in thread
From: Steven Rostedt @ 2010-03-05 15:06 UTC (permalink / raw)
  To: Américo Wang; +Cc: LKML, Frederic Weisbecker, Ingo Molnar

On Fri, 2010-03-05 at 15:16 +0800, Américo Wang wrote:
> On Fri, Mar 5, 2010 at 12:14 PM, Américo Wang <xiyou.wangcong@gmail.com> wrote:
> > On Thu, Mar 4, 2010 at 9:54 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> >> On Wed, 2010-03-03 at 14:04 +0800, Américo Wang wrote:
> >>> I am not sure if this is ftrace's fault, but it is ftrace who triggers
> >>> the soft lockup. On my machine, it is pretty easy, just run:
> >>>
> >>> echo function_graph > current_tracer
> >>>

> >
> > I can't say that because I didn't try -rc6.
> >
> 
> Sigh, 2.6.33-rc6 doesn't work, even 2.6.32 doesn't work...

So basically you are saying that the function_graph tracer, when enabled
has a high overhead? Well, unfortunately, that's expected.

The function_graph tracer traces the start and end of every function. It
uses the same mechanism as function tracer to trace the start of the
function (mcount), but to trace the exit of a function, in the enter of
the function it hijacks the return address and replaces it to call a
trampoline. This trampoline will do the trace and then jump back to the
original return address.

Doing this breaks branch prediction in the CPU, as the CPU uses call/ret
as part of its branch prediction analysis. So function graph tracing is
not just twice as slow as function tracing, it actually has a bigger
impact than that.

So my question to you is, have you seen the function graph perform
better with the same configs in previous kernels? Also, the function
graph makes other debugging (like lockdep) have a greater impact to
performance than they usually do.

Now some things you can do to help performance. One is not to trace
functions that are known to have a high hit rate. You can do this with
the set_ftrace_notrace file, or add "ftrace_notrace=func1,func2,func3"
to the command line where func1,func2,func3 are the functions you do not
want to trace. This just adds these by default to the set_ftrace_notrace
and can be removed at runtime.

The functions I commonly write to are:

echo '*spin_lock*' '*spin_unlock*' '*spin_try*' '*rcu_read*' > set_ftace_notrace

since these functions are hit quite intensively, by not tracing them it
helps a bit with performance.

-- Steve

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-05 15:06       ` Steven Rostedt
@ 2010-03-05 17:16         ` Frederic Weisbecker
  2010-03-05 18:35           ` Steven Rostedt
  2010-03-08  7:32           ` Li Zefan
  2010-03-08  2:31         ` Américo Wang
  1 sibling, 2 replies; 37+ messages in thread
From: Frederic Weisbecker @ 2010-03-05 17:16 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Américo Wang, LKML, Ingo Molnar

On Fri, Mar 05, 2010 at 10:06:43AM -0500, Steven Rostedt wrote:
> On Fri, 2010-03-05 at 15:16 +0800, Américo Wang wrote:
> > On Fri, Mar 5, 2010 at 12:14 PM, Américo Wang <xiyou.wangcong@gmail.com> wrote:
> > > On Thu, Mar 4, 2010 at 9:54 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> > >> On Wed, 2010-03-03 at 14:04 +0800, Américo Wang wrote:
> > >>> I am not sure if this is ftrace's fault, but it is ftrace who triggers
> > >>> the soft lockup. On my machine, it is pretty easy, just run:
> > >>>
> > >>> echo function_graph > current_tracer
> > >>>
> 
> > >
> > > I can't say that because I didn't try -rc6.
> > >
> > 
> > Sigh, 2.6.33-rc6 doesn't work, even 2.6.32 doesn't work...
> 
> So basically you are saying that the function_graph tracer, when enabled
> has a high overhead? Well, unfortunately, that's expected.
> 
> The function_graph tracer traces the start and end of every function. It
> uses the same mechanism as function tracer to trace the start of the
> function (mcount), but to trace the exit of a function, in the enter of
> the function it hijacks the return address and replaces it to call a
> trampoline. This trampoline will do the trace and then jump back to the
> original return address.
> 
> Doing this breaks branch prediction in the CPU, as the CPU uses call/ret
> as part of its branch prediction analysis. So function graph tracing is
> not just twice as slow as function tracing, it actually has a bigger
> impact than that.


It's true it has a high overhead, but not to the point of
making the whole system unusable. We are supposed to be even
far from that. I'm currently able to turn on the function graph
tracer and use firefox without problems. It's just a bit slower
but it's far from a visible starvation.

And Li seems to see the same thing.
For now I can not test, but I will try this week-end.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-05 17:16         ` Frederic Weisbecker
@ 2010-03-05 18:35           ` Steven Rostedt
  2010-03-05 18:43             ` Frederic Weisbecker
  2010-03-08  7:32           ` Li Zefan
  1 sibling, 1 reply; 37+ messages in thread
From: Steven Rostedt @ 2010-03-05 18:35 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Américo Wang, LKML, Ingo Molnar

On Fri, 2010-03-05 at 18:16 +0100, Frederic Weisbecker wrote:

> It's true it has a high overhead, but not to the point of
> making the whole system unusable. We are supposed to be even
> far from that. I'm currently able to turn on the function graph
> tracer and use firefox without problems. It's just a bit slower
> but it's far from a visible starvation.
> 
> And Li seems to see the same thing.
> For now I can not test, but I will try this week-end.

Americo said he's seen the issue as far back as 2.6.32. So perhaps some
CPUs take a bigger hit from the function graph tracer than others. I
have several different boxes that I can try. I've seen noticeable slow
downs but never something that cripples the box.

The only time that I've seen it cripple the box is when LOCKDEP_DEBUG
was set (which according to Americo's config it was not). But that's
because LOCKDEP_DEBUG updates a global variable every time interrupts
are enabled or disabled. This caused a huge cache line bouncing with the
function graph tracer since it caused this variable to be updated 4
times for every function call!

-- Steve

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-05 18:35           ` Steven Rostedt
@ 2010-03-05 18:43             ` Frederic Weisbecker
  2010-03-05 19:07               ` Steven Rostedt
  0 siblings, 1 reply; 37+ messages in thread
From: Frederic Weisbecker @ 2010-03-05 18:43 UTC (permalink / raw)
  To: Steven Rostedt, Peter Zijlstra; +Cc: Américo Wang, LKML, Ingo Molnar

On Fri, Mar 05, 2010 at 01:35:00PM -0500, Steven Rostedt wrote:
> On Fri, 2010-03-05 at 18:16 +0100, Frederic Weisbecker wrote:
> 
> > It's true it has a high overhead, but not to the point of
> > making the whole system unusable. We are supposed to be even
> > far from that. I'm currently able to turn on the function graph
> > tracer and use firefox without problems. It's just a bit slower
> > but it's far from a visible starvation.
> > 
> > And Li seems to see the same thing.
> > For now I can not test, but I will try this week-end.
> 
> Americo said he's seen the issue as far back as 2.6.32. So perhaps some
> CPUs take a bigger hit from the function graph tracer than others. I
> have several different boxes that I can try. I've seen noticeable slow
> downs but never something that cripples the box.
> 
> The only time that I've seen it cripple the box is when LOCKDEP_DEBUG
> was set (which according to Americo's config it was not). But that's
> because LOCKDEP_DEBUG updates a global variable every time interrupts
> are enabled or disabled. This caused a huge cache line bouncing with the
> function graph tracer since it caused this variable to be updated 4
> times for every function call!



Ouch...that's the hardirqs_off_events/redundant_hardirqs_off variables?
Those should be clearly made per cpu.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-05 18:43             ` Frederic Weisbecker
@ 2010-03-05 19:07               ` Steven Rostedt
  0 siblings, 0 replies; 37+ messages in thread
From: Steven Rostedt @ 2010-03-05 19:07 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Peter Zijlstra, Américo Wang, LKML, Ingo Molnar

On Fri, 2010-03-05 at 19:43 +0100, Frederic Weisbecker wrote:

> Ouch...that's the hardirqs_off_events/redundant_hardirqs_off variables?
> Those should be clearly made per cpu.

Yeah, probably all the debug_atomic_inc(*) variables should be made
per-cpu. Should not be too hard since they are already wrapped in
macros.

-- Steve




^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-05 17:16         ` Frederic Weisbecker
  2010-03-05 18:35           ` Steven Rostedt
@ 2010-03-08  7:32           ` Li Zefan
  2010-03-08 15:15             ` Steven Rostedt
  1 sibling, 1 reply; 37+ messages in thread
From: Li Zefan @ 2010-03-08  7:32 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Steven Rostedt, Américo Wang, LKML, Ingo Molnar

>>>>>> I am not sure if this is ftrace's fault, but it is ftrace who triggers
>>>>>> the soft lockup. On my machine, it is pretty easy, just run:
>>>>>>
>>>>>> echo function_graph > current_tracer
>>>>>>
>>>> I can't say that because I didn't try -rc6.
>>>>
>>> Sigh, 2.6.33-rc6 doesn't work, even 2.6.32 doesn't work...
>> So basically you are saying that the function_graph tracer, when enabled
>> has a high overhead? Well, unfortunately, that's expected.
>>
>> The function_graph tracer traces the start and end of every function. It
>> uses the same mechanism as function tracer to trace the start of the
>> function (mcount), but to trace the exit of a function, in the enter of
>> the function it hijacks the return address and replaces it to call a
>> trampoline. This trampoline will do the trace and then jump back to the
>> original return address.
>>
>> Doing this breaks branch prediction in the CPU, as the CPU uses call/ret
>> as part of its branch prediction analysis. So function graph tracing is
>> not just twice as slow as function tracing, it actually has a bigger
>> impact than that.
> 
> 
> It's true it has a high overhead, but not to the point of
> making the whole system unusable. We are supposed to be even
> far from that. I'm currently able to turn on the function graph
> tracer and use firefox without problems. It's just a bit slower
> but it's far from a visible starvation.
> 
> And Li seems to see the same thing.

Yes, and more than this. I can see segmentation fault while testing,
both my testing threads and kernel building threads that are running
at the same time can get segfault.

> For now I can not test, but I will try this week-end.
> 

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-08  7:32           ` Li Zefan
@ 2010-03-08 15:15             ` Steven Rostedt
  2010-03-09  2:11               ` Li Zefan
  0 siblings, 1 reply; 37+ messages in thread
From: Steven Rostedt @ 2010-03-08 15:15 UTC (permalink / raw)
  To: Li Zefan; +Cc: Frederic Weisbecker, Américo Wang, LKML, Ingo Molnar

On Mon, 2010-03-08 at 15:32 +0800, Li Zefan wrote:

> > And Li seems to see the same thing.
> 
> Yes, and more than this. I can see segmentation fault while testing,
> both my testing threads and kernel building threads that are running
> at the same time can get segfault.

Segfaults??

This sounds like the compiler may have screwed up again. Can you compile
with this patch and see if it catches anything?

-- Steve

commit 7e91e8a8cc4e2269146107dc91455a7f70a360b4
Author: Steven Rostedt <srostedt@redhat.com>
Date:   Thu Nov 19 23:41:02 2009 -0500

    tracing/x86: Add check to detect GCC messing with mcount prologue
    
    Latest versions of GCC create a funny prologue for some functions.
    Instead of the typical:
    
    		push   %ebp
    		mov    %esp,%ebp
    		and    $0xffffffe0,%esp
    		[...]
    		call   mcount
    
    GCC may try to align the stack before setting up the frame pointer
    register:
    
    		push   %edi
    		lea    0x8(%esp),%edi
    		and    $0xffffffe0,%esp
    		pushl  -0x4(%edi)
    		push   %ebp
    		mov    %esp,%ebp
    		[...]
    		call   mcount
    
    This crazy code places a copy of the return address into the
    frame pointer. The function graph tracer uses this pointer to
    save and replace the return address of the calling function to jump
    to the function graph tracer's return handler, which will put back
    the return address. But instead instead of the typical return:
    
    		mov    %ebp,%esp
    		pop    %ebp
    		ret
    
    The return of the function performs:
    
    		lea    -0x8(%edi),%esp
    		pop    %edi
    		ret
    
    The function graph tracer return handler will not be called at the exit
    of the function, but the parent function will call it. Because we missed
    the return of the child function, the handler will replace the parent's
    return address with that of the child. Obviously this will cause a crash
    (Note, there is code to detect this case and safely panic the kernel).
    
    The kicker is that this happens to just a handful of functions.
    And only with certain gcc options.
    
    Compiling with: 	-march=pentium-mmx
    will cause the problem to appear. But if you were to change
    pentium-mmx to i686 or add -mtune=generic, then the problem goes away.
    
    I first saw this problem when compiling with optimize for size.
    But it seems that various other options may cause this issue to arise.
    
    Instead of completely disabling the function graph tracer for i386 builds
    this patch adds a check to recordmcount.pl to make sure that all
    functions that contain a call to mcount start with "push %ebp".
    If not, it will fail the compile and print out the nasty warning:
    
      CC      kernel/time/timer_stats.o
    
    ********************************************************
      Your version of GCC breaks the function graph tracer
      Please disable CONFIG_FUNCTION_GRAPH_TRACER
      Failed function was "timer_stats_update_stats"
    ********************************************************
    
    The script recordmcount.pl is given a new parameter "do_check". If
    this is negative, the script will only perform this check without
    creating the mcount caller section. This will be executed for x86_32
    when CONFIG_FUNCTION_GRAPH_TRACER is enabled and CONFIG_DYNAMIC_FTRACE
    is not.
    
    If the arch is x86_32 and $do_check is greater than 1, it will perform
    the check while processing the mcount callers. If $do_check is 0, then
    no check will be performed. This is for non x86_32 archs and when
    compiling without CONFIG_FUNCTION_GRAPH_TRACER enabled, even on x86_32.
    
    Reported-by: Thomas Gleixner <tglx@linutronix.de>
    LKML-Reference: <alpine.LFD.2.00.0911191423190.24119@localhost.localdomain>
    Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index d006554..380e815 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -143,7 +143,6 @@ config FUNCTION_GRAPH_TRACER
 	bool "Kernel Function Graph Tracer"
 	depends on HAVE_FUNCTION_GRAPH_TRACER
 	depends on FUNCTION_TRACER
-	depends on !X86_32 || !CC_OPTIMIZE_FOR_SIZE
 	default y
 	help
 	  Enable the kernel to trace a function at both its return
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 341b589..3b897f2 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -206,10 +206,33 @@ cmd_modversions =							\
 endif
 
 ifdef CONFIG_FTRACE_MCOUNT_RECORD
+
+ ifdef CONFIG_FUNCTION_GRAPH_TRACER
+  ifdef CONFIG_X86_32
+   rm_do_check = 1
+  else
+   rm_do_check = 0
+  endif
+ else
+  rm_do_check = 0
+ endif
+
 cmd_record_mcount = set -e ; perl $(srctree)/scripts/recordmcount.pl "$(ARCH)" \
 	"$(if $(CONFIG_64BIT),64,32)" \
 	"$(OBJDUMP)" "$(OBJCOPY)" "$(CC)" "$(LD)" "$(NM)" "$(RM)" "$(MV)" \
-	"$(if $(part-of-module),1,0)" "$(@)";
+	"$(if $(part-of-module),1,0)" "$(rm_do_check)" "$(@)";
+
+else
+
+ ifdef CONFIG_X86_32
+ ifdef CONFIG_FUNCTION_GRAPH_TRACER
+   cmd_record_mcount = set -e ; perl $(srctree)/scripts/recordmcount.pl "$(ARCH)" \
+	"$(if $(CONFIG_64BIT),64,32)" \
+	"$(OBJDUMP)" "$(OBJCOPY)" "$(CC)" "$(LD)" "$(NM)" "$(RM)" "$(MV)" \
+	"$(if $(part-of-module),1,0)" "-1" "$(@)";
+ endif
+ endif
+
 endif
 
 define rule_cc_o_c
diff --git a/scripts/recordmcount.pl b/scripts/recordmcount.pl
index f0d1445..f3b45a5 100755
--- a/scripts/recordmcount.pl
+++ b/scripts/recordmcount.pl
@@ -113,14 +113,14 @@ $P =~ s@.*/@@g;
 
 my $V = '0.1';
 
-if ($#ARGV != 10) {
-	print "usage: $P arch bits objdump objcopy cc ld nm rm mv is_module inputfile\n";
+if ($#ARGV < 11) {
+	print "usage: $P arch bits objdump objcopy cc ld nm rm mv is_module do_check inputfile\n";
 	print "version: $V\n";
 	exit(1);
 }
 
 my ($arch, $bits, $objdump, $objcopy, $cc,
-    $ld, $nm, $rm, $mv, $is_module, $inputfile) = @ARGV;
+    $ld, $nm, $rm, $mv, $is_module, $do_check, $inputfile) = @ARGV;
 
 # This file refers to mcount and shouldn't be ftraced, so lets' ignore it
 if ($inputfile =~ m,kernel/trace/ftrace\.o$,) {
@@ -144,6 +144,60 @@ $nm = "nm" if ((length $nm) == 0);
 $rm = "rm" if ((length $rm) == 0);
 $mv = "mv" if ((length $mv) == 0);
 
+# gcc can do stupid things with the stack pointer on x86_32.
+# It may pass a copy of the return address to mcount, which will
+# break the function graph tracer. If this happens then we need
+# to flag it and break the build.
+#
+# For x86_32, the parameter do_check will be negative if we only
+# want to perform the check, and positive if we should od the check.
+# If function graph tracing is not enabled, do_check will be zero.
+#
+
+my $check_next_line = 0;
+my $line_failed = 0;
+my $last_function;
+
+sub test_x86_32_prologue
+{
+    if ($check_next_line) {
+	if (!/push\s*\%ebp/) {
+	    $line_failed = 1;
+	}
+    }
+
+    if ($line_failed && /mcount/) {
+	print STDERR "\n********************************************************\n";
+	print STDERR "  Your version of GCC breaks the function graph tracer\n";
+	print STDERR "  Please disable CONFIG_FUNCTION_GRAPH_TRACER\n";
+	print STDERR "  Failed function was \"$last_function\"\n";
+	print STDERR "********************************************************\n\n";
+	exit -1;
+    }
+    $check_next_line = 0;
+
+    # check the first line after a function starts for
+    #  push %ebp
+    if (/^[0-9a-fA-F]+\s+<([a-zA-Z_].*)>:$/) {
+	$last_function = $1;
+	$check_next_line = 1;
+	$line_failed = 0;
+    }
+}
+
+if ($do_check < 0) {
+    # Only perform the check and quit
+    open(IN, "$objdump -dr $inputfile|") || die "error running $objdump";
+
+    while (<IN>) {
+	test_x86_32_prologue;
+    }
+    close (IN);
+    exit 0;
+}
+
+my $check = 0;
+
 #print STDERR "running: $P '$arch' '$objdump' '$objcopy' '$cc' '$ld' " .
 #    "'$nm' '$rm' '$mv' '$inputfile'\n";
 
@@ -202,6 +256,12 @@ if ($arch eq "x86") {
     }
 }
 
+if ($arch eq "i386") {
+    if ($do_check) {
+	$check = 1;
+    }
+}
+
 #
 # We base the defaults off of i386, the other archs may
 # feel free to change them in the below if statements.
@@ -401,6 +461,13 @@ my $read_headers = 1;
 
 while (<IN>) {
 
+    # x86_32 may need to test the start of every function to see
+    # if GCC did not mess up the mcount prologue. All functions must
+    # start with push %ebp, otherwise it is broken.
+    if ($check) {
+	test_x86_32_prologue;
+    }
+
     if ($read_headers && /$mcount_section/) {
 	#
 	# Somehow the make process can execute this script on an



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-08 15:15             ` Steven Rostedt
@ 2010-03-09  2:11               ` Li Zefan
  2010-03-09  2:23                 ` Steven Rostedt
  0 siblings, 1 reply; 37+ messages in thread
From: Li Zefan @ 2010-03-09  2:11 UTC (permalink / raw)
  To: rostedt; +Cc: Frederic Weisbecker, Américo Wang, LKML, Ingo Molnar

Steven Rostedt wrote:
> On Mon, 2010-03-08 at 15:32 +0800, Li Zefan wrote:
> 
>>> And Li seems to see the same thing.
>> Yes, and more than this. I can see segmentation fault while testing,
>> both my testing threads and kernel building threads that are running
>> at the same time can get segfault.
> 
> Segfaults??
> 

Yes.

> This sounds like the compiler may have screwed up again. Can you compile
> with this patch and see if it catches anything?
> 

But I was using a x86_64 machine, and this patch won't do anything on
x86_64?



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-09  2:11               ` Li Zefan
@ 2010-03-09  2:23                 ` Steven Rostedt
  2010-03-09  3:38                   ` Li Zefan
  0 siblings, 1 reply; 37+ messages in thread
From: Steven Rostedt @ 2010-03-09  2:23 UTC (permalink / raw)
  To: Li Zefan; +Cc: Frederic Weisbecker, Américo Wang, LKML, Ingo Molnar

On Tue, 2010-03-09 at 10:11 +0800, Li Zefan wrote:
> Steven Rostedt wrote:
> > On Mon, 2010-03-08 at 15:32 +0800, Li Zefan wrote:

> > This sounds like the compiler may have screwed up again. Can you compile
> > with this patch and see if it catches anything?
> > 
> 
> But I was using a x86_64 machine, and this patch won't do anything on
> x86_64?

I wonder if this issue has crept to x86_64. What kind of dump do you
get? And if this segfault is in the kernel, could you see (using gdb on
a vmlinux) if any of the stack frame setups look peculiar?

Thanks,

-- Steve




^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-09  2:23                 ` Steven Rostedt
@ 2010-03-09  3:38                   ` Li Zefan
  2010-03-12 19:20                     ` Steven Rostedt
  2010-03-13  2:07                     ` Steven Rostedt
  0 siblings, 2 replies; 37+ messages in thread
From: Li Zefan @ 2010-03-09  3:38 UTC (permalink / raw)
  To: rostedt; +Cc: Frederic Weisbecker, Américo Wang, LKML, Ingo Molnar

Steven Rostedt wrote:
> On Tue, 2010-03-09 at 10:11 +0800, Li Zefan wrote:
>> Steven Rostedt wrote:
>>> On Mon, 2010-03-08 at 15:32 +0800, Li Zefan wrote:
> 
>>> This sounds like the compiler may have screwed up again. Can you compile
>>> with this patch and see if it catches anything?
>>>
>> But I was using a x86_64 machine, and this patch won't do anything on
>> x86_64?
> 
> I wonder if this issue has crept to x86_64. What kind of dump do you
> get? And if this segfault is in the kernel, could you see (using gdb on
> a vmlinux) if any of the stack frame setups look peculiar?
> 

The segfault is from userspace. This is a segfault I just got:

ftrace_stack_ma[14989]: segfault at fe2e ip 00000000004474c0 sp 00007fffb97e38c0 error 4 in bash[400000+d1000]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-09  3:38                   ` Li Zefan
@ 2010-03-12 19:20                     ` Steven Rostedt
  2010-03-12 21:38                       ` Steven Rostedt
  2010-03-13  2:07                     ` Steven Rostedt
  1 sibling, 1 reply; 37+ messages in thread
From: Steven Rostedt @ 2010-03-12 19:20 UTC (permalink / raw)
  To: Li Zefan
  Cc: Frederic Weisbecker, Américo Wang, LKML, Ingo Molnar,
	Lai Jiangshan

On Tue, 2010-03-09 at 11:38 +0800, Li Zefan wrote:

> The segfault is from userspace. This is a segfault I just got:
> 
> ftrace_stack_ma[14989]: segfault at fe2e ip 00000000004474c0 sp 00007fffb97e38c0 error 4 in bash[400000+d1000]

I've spent the last few days hunting this down (and I'm still hunting).
I don't think this is caused by the function graph tracer (I triggered
it once when it was running the wakeup rt test).

One thing that I noticed so far in all my crashes is that there's two
things that seem to always be enabled.

The lock events are enabled, and more importantly, the userstacktrace
option is also enabled.

What I do is when this segfault is triggered, I save off the ring
buffer. I can retrieve the ring buffer later by echoing something into a
file I added called "swap_buffers". Below is the patch if you are
interested.

I already found fixes for the ring buffer reset crash (which I should
probably send out soon). But I've been trying to get my head around this
segfault, with no luck yet.

Oh, also, do you have "rep_good" set on the box you see this with?

$ grep rep_good /proc/cpuinfo

I also want to know the importance of that "fe2e" that it faults at.
Seems to be the address most of the time. I did get a "1" before.

Thanks!

-- Steve

---
 arch/x86/mm/fault.c        |    3 +
 kernel/trace/ring_buffer.c |    5 ++
 kernel/trace/trace.c       |   84 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 91 insertions(+), 1 deletion(-)

Index: linux-trace.git/arch/x86/mm/fault.c
===================================================================
--- linux-trace.git.orig/arch/x86/mm/fault.c	2010-01-28 13:01:31.000000000 -0500
+++ linux-trace.git/arch/x86/mm/fault.c	2010-03-12 14:11:19.000000000 -0500
@@ -676,6 +676,7 @@ no_context(struct pt_regs *regs, unsigne
 	oops_end(flags, regs, sig);
 }
 
+extern void sdr_save_trace(void);
 /*
  * Print out info about fatal segfaults, if the show_unhandled_signals
  * sysctl is set:
@@ -690,6 +691,8 @@ show_signal_msg(struct pt_regs *regs, un
 	if (!printk_ratelimit())
 		return;
 
+	sdr_save_trace();
+
 	printk("%s%s[%d]: segfault at %lx ip %p sp %p error %lx",
 		task_pid_nr(tsk) > 1 ? KERN_INFO : KERN_EMERG,
 		tsk->comm, task_pid_nr(tsk), address,
Index: linux-trace.git/kernel/trace/ring_buffer.c
===================================================================
--- linux-trace.git.orig/kernel/trace/ring_buffer.c	2010-03-09 14:14:06.000000000 -0500
+++ linux-trace.git/kernel/trace/ring_buffer.c	2010-03-11 14:30:45.000000000 -0500
@@ -3404,6 +3404,7 @@ rb_reset_cpu(struct ring_buffer_per_cpu 
  * @buffer: The ring buffer to reset a per cpu buffer of
  * @cpu: The CPU buffer to be reset
  */
+void sdr_save_trace(void);
 void ring_buffer_reset_cpu(struct ring_buffer *buffer, int cpu)
 {
 	struct ring_buffer_per_cpu *cpu_buffer = buffer->buffers[cpu];
@@ -3416,8 +3417,10 @@ void ring_buffer_reset_cpu(struct ring_b
 
 	spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
 
-	if (RB_WARN_ON(cpu_buffer, local_read(&cpu_buffer->committing)))
+	if (RB_WARN_ON(cpu_buffer, local_read(&cpu_buffer->committing))) {
+		sdr_save_trace();
 		goto out;
+	}
 
 	arch_spin_lock(&cpu_buffer->lock);
 
Index: linux-trace.git/kernel/trace/trace.c
===================================================================
--- linux-trace.git.orig/kernel/trace/trace.c	2010-03-09 09:32:22.000000000 -0500
+++ linux-trace.git/kernel/trace/trace.c	2010-03-12 14:11:19.000000000 -0500
@@ -44,6 +44,8 @@
 
 #define TRACE_BUFFER_FLAGS	(RB_FL_OVERWRITE)
 
+static struct ring_buffer *save_buff;
+
 /*
  * On boot up, the ring buffer is set to the minimum size, so that
  * we do not waste memory on systems that are not using tracing.
@@ -2710,6 +2712,8 @@ int tracer_init(struct tracer *t, struct
 	return t->init(tr);
 }
 
+static int saved;
+
 static int tracing_resize_ring_buffer(unsigned long size)
 {
 	int ret;
@@ -2752,11 +2756,82 @@ static int tracing_resize_ring_buffer(un
 		return ret;
 	}
 
+	if (!saved)
+		ring_buffer_resize(save_buff, size);
+
 	global_trace.entries = size;
 
 	return ret;
 }
 
+static arch_spinlock_t save_buf_lock =
+	(arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
+
+void sdr_save_trace(void)
+{
+	struct ring_buffer *buff;
+
+	arch_spin_lock(&save_buf_lock);
+	if (saved)
+		goto out;
+	trace_printk("DIED HERE (%s)\n", current_trace->name);
+	buff = global_trace.buffer;
+	global_trace.buffer = save_buff;
+	save_buff = buff;
+	tracing_off();
+	saved = 1;
+	dump_stack();
+	printk("buffer entries=%ld\n", ring_buffer_entries(save_buff));
+ out:
+	arch_spin_unlock(&save_buf_lock);
+}
+
+static void swap_buffers(void)
+{
+	struct ring_buffer *buff;
+
+	arch_spin_lock(&save_buf_lock);
+	printk("buffer entries=%ld\n", ring_buffer_entries(save_buff));
+	buff = global_trace.buffer;
+	global_trace.buffer = save_buff;
+	save_buff = buff;
+	arch_spin_unlock(&save_buf_lock);
+}
+
+
+static ssize_t
+tracing_swap_write(struct file *filp, const char __user *ubuf,
+		   size_t cnt, loff_t *ppos)
+{
+	swap_buffers();
+
+	saved = 0;
+
+	*ppos += cnt;
+
+	return cnt;
+}
+
+static ssize_t
+tracing_swap_read(struct file *filp, char __user *ubuf,
+		  size_t cnt, loff_t *ppos)
+{
+	char buf[64];
+	int r;
+
+	r = snprintf(buf, sizeof(buf), "%d\n", saved);
+	if (r > sizeof(buf))
+		r = sizeof(buf);
+	return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
+}
+
+static const struct file_operations tracing_swap_fops = {
+	.open		= tracing_open_generic,
+	.read		= tracing_swap_read,
+	.write		= tracing_swap_write,
+};
+
+
 /**
  * tracing_update_buffers - used by tracing facility to expand ring buffers
  *
@@ -4228,6 +4303,9 @@ static __init int tracer_init_debugfs(vo
 
 	d_tracer = tracing_init_dentry();
 
+	trace_create_file("swap_buffers", 0644, d_tracer,
+			&global_trace, &tracing_swap_fops);
+
 	trace_create_file("tracing_enabled", 0644, d_tracer,
 			&global_trace, &tracing_ctrl_fops);
 
@@ -4469,6 +4547,12 @@ __init static int tracer_alloc_buffers(v
 	cpumask_copy(tracing_buffer_mask, cpu_possible_mask);
 	cpumask_copy(tracing_cpumask, cpu_all_mask);
 
+	save_buff = ring_buffer_alloc(ring_buf_size, TRACE_BUFFER_FLAGS);
+	if (!save_buff) {
+		printk(KERN_ERR "failed to allocate!\n");
+		goto out_free_buffer_mask;
+	}
+
 	/* TODO: make the number of buffers hot pluggable with CPUS */
 	global_trace.buffer = ring_buffer_alloc(ring_buf_size,
 						   TRACE_BUFFER_FLAGS);



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-12 19:20                     ` Steven Rostedt
@ 2010-03-12 21:38                       ` Steven Rostedt
  0 siblings, 0 replies; 37+ messages in thread
From: Steven Rostedt @ 2010-03-12 21:38 UTC (permalink / raw)
  To: Li Zefan
  Cc: Frederic Weisbecker, Américo Wang, LKML, Ingo Molnar,
	Lai Jiangshan

On Fri, 2010-03-12 at 14:20 -0500, Steven Rostedt wrote:
> On Tue, 2010-03-09 at 11:38 +0800, Li Zefan wrote:
> 
> > The segfault is from userspace. This is a segfault I just got:
> > 
> > ftrace_stack_ma[14989]: segfault at fe2e ip 00000000004474c0 sp 00007fffb97e38c0 error 4 in bash[400000+d1000]

I found a reproducer that triggers in seconds:

# echo 1 > options/userstacktrace
# echo 1 > events/lock/enable
# while do :; ls > /dev/null; done

Seems the triggering of the userspace stack trace may be messing with
dynamic linking. Note, this does not happen on all of my boxes. It
happens on fedora 11 but not on RHEL 5.1.

I'm still investigating.

-- Steve



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-09  3:38                   ` Li Zefan
  2010-03-12 19:20                     ` Steven Rostedt
@ 2010-03-13  2:07                     ` Steven Rostedt
  2010-03-13  5:37                       ` Américo Wang
  2010-03-13 17:56                       ` Steven Rostedt
  1 sibling, 2 replies; 37+ messages in thread
From: Steven Rostedt @ 2010-03-13  2:07 UTC (permalink / raw)
  To: Li Zefan, Lai Jiangshan
  Cc: Frederic Weisbecker, Américo Wang, LKML, Ingo Molnar

On Tue, 2010-03-09 at 11:38 +0800, Li Zefan wrote:

> The segfault is from userspace. This is a segfault I just got:
> 
> ftrace_stack_ma[14989]: segfault at fe2e ip 00000000004474c0 sp 00007fffb97e38c0 error 4 in bash[400000+d1000]

Li and Lai,

Thanks a lot for reporting all this, and writing the ftrace_stress_test.
I've fixed all the bugs that it has reported (to my knowledge) and after
a little more testing I will soon be sending a patch series out.

Just wanted to say thanks again, your efforts have helped out
tremendously!

Americo,

I still do not know why the function graph tracer is making your system
unusable. I'll look more into that when I get a chance. So far, I have
not had any issues on any of my boxes.

-- Steve

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-13  2:07                     ` Steven Rostedt
@ 2010-03-13  5:37                       ` Américo Wang
  2010-03-13  5:51                         ` Steven Rostedt
  2010-03-14 16:55                         ` Steven Rostedt
  2010-03-13 17:56                       ` Steven Rostedt
  1 sibling, 2 replies; 37+ messages in thread
From: Américo Wang @ 2010-03-13  5:37 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Li Zefan, Lai Jiangshan, Frederic Weisbecker, Américo Wang,
	LKML, Ingo Molnar

On Fri, Mar 12, 2010 at 09:07:29PM -0500, Steven Rostedt wrote:
>
>Americo,
>
>I still do not know why the function graph tracer is making your system
>unusable. I'll look more into that when I get a chance. So far, I have
>not had any issues on any of my boxes.
>

Oh, in fact, I just tried 2 x86_64 machines, both of them have this
problem.

Sorry that I don't have much time to do bisect, the last kernel that
I tried is 2.6.32-rc6, it is bad too. I will try some older kernels
if I have time.

Thank you.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-13  5:37                       ` Américo Wang
@ 2010-03-13  5:51                         ` Steven Rostedt
  2010-03-14 16:55                         ` Steven Rostedt
  1 sibling, 0 replies; 37+ messages in thread
From: Steven Rostedt @ 2010-03-13  5:51 UTC (permalink / raw)
  To: Américo Wang
  Cc: Li Zefan, Lai Jiangshan, Frederic Weisbecker, LKML, Ingo Molnar

On Sat, 2010-03-13 at 13:37 +0800, Américo Wang wrote:
> On Fri, Mar 12, 2010 at 09:07:29PM -0500, Steven Rostedt wrote:
> >
> >Americo,
> >
> >I still do not know why the function graph tracer is making your system
> >unusable. I'll look more into that when I get a chance. So far, I have
> >not had any issues on any of my boxes.
> >
> 
> Oh, in fact, I just tried 2 x86_64 machines, both of them have this
> problem.
> 
> Sorry that I don't have much time to do bisect, the last kernel that
> I tried is 2.6.32-rc6, it is bad too. I will try some older kernels
> if I have time.

It may be a config problem. I'll analyze your config again and see if I
can find something with it. But this will need to wait till next week.

Thanks,

-- Steve


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-13  5:37                       ` Américo Wang
  2010-03-13  5:51                         ` Steven Rostedt
@ 2010-03-14 16:55                         ` Steven Rostedt
  1 sibling, 0 replies; 37+ messages in thread
From: Steven Rostedt @ 2010-03-14 16:55 UTC (permalink / raw)
  To: Américo Wang
  Cc: Li Zefan, Lai Jiangshan, Frederic Weisbecker, LKML, Ingo Molnar

On Sat, 2010-03-13 at 13:37 +0800, Américo Wang wrote:
> On Fri, Mar 12, 2010 at 09:07:29PM -0500, Steven Rostedt wrote:
> >
> >Americo,
> >
> >I still do not know why the function graph tracer is making your system
> >unusable. I'll look more into that when I get a chance. So far, I have
> >not had any issues on any of my boxes.
> >
> 
> Oh, in fact, I just tried 2 x86_64 machines, both of them have this
> problem.
> 
> Sorry that I don't have much time to do bisect, the last kernel that
> I tried is 2.6.32-rc6, it is bad too. I will try some older kernels
> if I have time.

It may be a config problem. I'll analyze your config again and see if I
can find something with it. But this will need to wait till next week.

Thanks,

-- Steve




^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-13  2:07                     ` Steven Rostedt
  2010-03-13  5:37                       ` Américo Wang
@ 2010-03-13 17:56                       ` Steven Rostedt
  2010-03-15  1:46                         ` Li Zefan
  1 sibling, 1 reply; 37+ messages in thread
From: Steven Rostedt @ 2010-03-13 17:56 UTC (permalink / raw)
  To: Li Zefan
  Cc: Lai Jiangshan, Frederic Weisbecker, Américo Wang, LKML,
	Ingo Molnar

On Fri, 2010-03-12 at 21:07 -0500, Steven Rostedt wrote:
> On Tue, 2010-03-09 at 11:38 +0800, Li Zefan wrote:
> 
> > The segfault is from userspace. This is a segfault I just got:
> > 
> > ftrace_stack_ma[14989]: segfault at fe2e ip 00000000004474c0 sp 00007fffb97e38c0 error 4 in bash[400000+d1000]
> 
> Li and Lai,
> 
> Thanks a lot for reporting all this, and writing the ftrace_stress_test.
> I've fixed all the bugs that it has reported (to my knowledge) and after
> a little more testing I will soon be sending a patch series out.

Li,

I ran your test for 13 hours on three different machines. I did not hit
any more errors. Although I see a lockdep warning (not a locking problem
but the WARN_ON in fork.c is triggering). I'll look into that on Monday.

Please take the new patches that Ingo just pushed to Linus, if you find
any more bugs, please let me know right away.

Thanks!

-- Steve



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-13 17:56                       ` Steven Rostedt
@ 2010-03-15  1:46                         ` Li Zefan
  2010-03-15 14:39                           ` Steven Rostedt
  0 siblings, 1 reply; 37+ messages in thread
From: Li Zefan @ 2010-03-15  1:46 UTC (permalink / raw)
  To: rostedt
  Cc: Lai Jiangshan, Frederic Weisbecker, Américo Wang, LKML,
	Ingo Molnar

Steven Rostedt wrote:
> On Fri, 2010-03-12 at 21:07 -0500, Steven Rostedt wrote:
>> On Tue, 2010-03-09 at 11:38 +0800, Li Zefan wrote:
>>
>>> The segfault is from userspace. This is a segfault I just got:
>>>
>>> ftrace_stack_ma[14989]: segfault at fe2e ip 00000000004474c0 sp 00007fffb97e38c0 error 4 in bash[400000+d1000]
>> Li and Lai,
>>
>> Thanks a lot for reporting all this, and writing the ftrace_stress_test.
>> I've fixed all the bugs that it has reported (to my knowledge) and after
>> a little more testing I will soon be sending a patch series out.
> 
> Li,
> 
> I ran your test for 13 hours on three different machines. I did not hit
> any more errors. Although I see a lockdep warning (not a locking problem
> but the WARN_ON in fork.c is triggering). I'll look into that on Monday.
> 

As I said in the offline mail, this warning is caused by global trace_clock,
and I sent a fix sometime ago:

http://lkml.org/lkml/2009/12/21/4

> Please take the new patches that Ingo just pushed to Linus, if you find
> any more bugs, please let me know right away.
> 

Thanks for the fixes!


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-15  1:46                         ` Li Zefan
@ 2010-03-15 14:39                           ` Steven Rostedt
  2010-03-15 16:16                             ` Steven Rostedt
  0 siblings, 1 reply; 37+ messages in thread
From: Steven Rostedt @ 2010-03-15 14:39 UTC (permalink / raw)
  To: Li Zefan
  Cc: Lai Jiangshan, Frederic Weisbecker, Américo Wang, LKML,
	Ingo Molnar, Peter Zijlstra

On Mon, 2010-03-15 at 09:46 +0800, Li Zefan wrote:
> Steven Rostedt wrote:
> > On Fri, 2010-03-12 at 21:07 -0500, Steven Rostedt wrote:
> >> On Tue, 2010-03-09 at 11:38 +0800, Li Zefan wrote:
> >>
> >>> The segfault is from userspace. This is a segfault I just got:
> >>>
> >>> ftrace_stack_ma[14989]: segfault at fe2e ip 00000000004474c0 sp 00007fffb97e38c0 error 4 in bash[400000+d1000]
> >> Li and Lai,
> >>
> >> Thanks a lot for reporting all this, and writing the ftrace_stress_test.
> >> I've fixed all the bugs that it has reported (to my knowledge) and after
> >> a little more testing I will soon be sending a patch series out.
> > 
> > Li,
> > 
> > I ran your test for 13 hours on three different machines. I did not hit
> > any more errors. Although I see a lockdep warning (not a locking problem
> > but the WARN_ON in fork.c is triggering). I'll look into that on Monday.
> > 
> 
> As I said in the offline mail, this warning is caused by global trace_clock,
> and I sent a fix sometime ago:
> 
> http://lkml.org/lkml/2009/12/21/4

Ah, I was waiting for a final Acked-by: from Peter Zijlstra.

Peter,

Should I pull in that patch? I don't see anything wrong with converting
a raw_local_irq_save() to a local_irq_save() in the tracing code. I'll
apply it locally and see if it fixes the warn on that I see.

Note, even the function tracer uses straight local_irq_save(), so I
would think that the clock code could too.

-- Steve

> 
> > Please take the new patches that Ingo just pushed to Linus, if you find
> > any more bugs, please let me know right away.
> > 
> 
> Thanks for the fixes!
> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-15 14:39                           ` Steven Rostedt
@ 2010-03-15 16:16                             ` Steven Rostedt
  0 siblings, 0 replies; 37+ messages in thread
From: Steven Rostedt @ 2010-03-15 16:16 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Lai Jiangshan, Frederic Weisbecker, Américo Wang, LKML,
	Ingo Molnar, Li Zefan

On Mon, 2010-03-15 at 10:39 -0400, Steven Rostedt wrote:

> > As I said in the offline mail, this warning is caused by global trace_clock,
> > and I sent a fix sometime ago:
> > 
> > http://lkml.org/lkml/2009/12/21/4
> 
> Ah, I was waiting for a final Acked-by: from Peter Zijlstra.
> 
> Peter,
> 
> Should I pull in that patch? I don't see anything wrong with converting
> a raw_local_irq_save() to a local_irq_save() in the tracing code. I'll
> apply it locally and see if it fixes the warn on that I see.

Peter,

I ran the stress test for an hour without any lockdep warnings with Li's
patch. Normally it would trigger in a second. Can I get your acked-by?

Thanks,

-- Steve



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-05 15:06       ` Steven Rostedt
  2010-03-05 17:16         ` Frederic Weisbecker
@ 2010-03-08  2:31         ` Américo Wang
  2010-03-09  2:40           ` Américo Wang
  1 sibling, 1 reply; 37+ messages in thread
From: Américo Wang @ 2010-03-08  2:31 UTC (permalink / raw)
  To: rostedt; +Cc: LKML, Frederic Weisbecker, Ingo Molnar

On Fri, Mar 5, 2010 at 11:06 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Fri, 2010-03-05 at 15:16 +0800, Américo Wang wrote:
>> On Fri, Mar 5, 2010 at 12:14 PM, Américo Wang <xiyou.wangcong@gmail.com> wrote:
>> > On Thu, Mar 4, 2010 at 9:54 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
>> >> On Wed, 2010-03-03 at 14:04 +0800, Américo Wang wrote:
>> >>> I am not sure if this is ftrace's fault, but it is ftrace who triggers
>> >>> the soft lockup. On my machine, it is pretty easy, just run:
>> >>>
>> >>> echo function_graph > current_tracer
>> >>>
>
>> >
>> > I can't say that because I didn't try -rc6.
>> >
>>
>> Sigh, 2.6.33-rc6 doesn't work, even 2.6.32 doesn't work...
>
> So basically you are saying that the function_graph tracer, when enabled
> has a high overhead? Well, unfortunately, that's expected.


A unusable system is expected? ;-)
Here my machine is too slow to use.

>
> So my question to you is, have you seen the function graph perform
> better with the same configs in previous kernels? Also, the function
> graph makes other debugging (like lockdep) have a greater impact to
> performance than they usually do.


Oh, sorry, I still don't find a working kernel, I will try today.

>
> Now some things you can do to help performance. One is not to trace
> functions that are known to have a high hit rate. You can do this with
> the set_ftrace_notrace file, or add "ftrace_notrace=func1,func2,func3"
> to the command line where func1,func2,func3 are the functions you do not
> want to trace. This just adds these by default to the set_ftrace_notrace
> and can be removed at runtime.
>
>
> The functions I commonly write to are:
>
> echo '*spin_lock*' '*spin_unlock*' '*spin_try*' '*rcu_read*' > set_ftace_notrace
>
> since these functions are hit quite intensively, by not tracing them it
> helps a bit with performance.

I will try this now.

Thanks!

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-08  2:31         ` Américo Wang
@ 2010-03-09  2:40           ` Américo Wang
  2010-03-09  2:51             ` Steven Rostedt
  0 siblings, 1 reply; 37+ messages in thread
From: Américo Wang @ 2010-03-09  2:40 UTC (permalink / raw)
  To: rostedt; +Cc: LKML, Frederic Weisbecker, Ingo Molnar

On Mon, Mar 8, 2010 at 10:31 AM, Américo Wang <xiyou.wangcong@gmail.com> wrote:
> On Fri, Mar 5, 2010 at 11:06 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
>>
>> Now some things you can do to help performance. One is not to trace
>> functions that are known to have a high hit rate. You can do this with
>> the set_ftrace_notrace file, or add "ftrace_notrace=func1,func2,func3"
>> to the command line where func1,func2,func3 are the functions you do not
>> want to trace. This just adds these by default to the set_ftrace_notrace
>> and can be removed at runtime.
>>
>>
>> The functions I commonly write to are:
>>
>> echo '*spin_lock*' '*spin_unlock*' '*spin_try*' '*rcu_read*' > set_ftace_notrace
>>
>> since these functions are hit quite intensively, by not tracing them it
>> helps a bit with performance.
>
> I will try this now.
>

Unfortunately, this doesn't help.

What is worse, I got another soft lockup, although I already turned LOCKDEP off.


Thanks.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-09  2:40           ` Américo Wang
@ 2010-03-09  2:51             ` Steven Rostedt
  2010-03-09  5:06               ` Américo Wang
  0 siblings, 1 reply; 37+ messages in thread
From: Steven Rostedt @ 2010-03-09  2:51 UTC (permalink / raw)
  To: Américo Wang; +Cc: LKML, Frederic Weisbecker, Ingo Molnar

On Tue, 2010-03-09 at 10:40 +0800, Américo Wang wrote:

> > I will try this now.
> >
> 
> Unfortunately, this doesn't help.
> 
> What is worse, I got another soft lockup, although I already turned LOCKDEP off.

Hmm, this really sounds like somethings badly broken.

What version of GCC are you using?

-- Steve



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: 2.6.33: ftrace triggers soft lockup
  2010-03-09  2:51             ` Steven Rostedt
@ 2010-03-09  5:06               ` Américo Wang
  0 siblings, 0 replies; 37+ messages in thread
From: Américo Wang @ 2010-03-09  5:06 UTC (permalink / raw)
  To: rostedt; +Cc: LKML, Frederic Weisbecker, Ingo Molnar

On Tue, Mar 9, 2010 at 10:51 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Tue, 2010-03-09 at 10:40 +0800, Américo Wang wrote:
>
>> > I will try this now.
>> >
>>
>> Unfortunately, this doesn't help.
>>
>> What is worse, I got another soft lockup, although I already turned LOCKDEP off.
>
> Hmm, this really sounds like somethings badly broken.
>
> What version of GCC are you using?

# gcc -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-libgcj-multifile
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada
--enable-java-awt=gtk --disable-dssi --enable-plugin
--with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre
--with-cpu=generic --host=x86_64-redhat-linux
Thread model: posix
gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2010-03-15 16:16 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-03-03  6:04 2.6.33: ftrace triggers soft lockup Américo Wang
2010-03-03 13:05 ` Steven Rostedt
2010-03-04  2:58   ` Américo Wang
2010-03-04  1:33 ` Li Zefan
2010-03-04  1:46 ` Frederic Weisbecker
2010-03-04  3:01   ` Américo Wang
2010-03-04  3:18     ` Steven Rostedt
2010-03-04  4:06       ` Frederic Weisbecker
2010-03-04  5:10       ` Américo Wang
2010-03-04  5:18         ` Frederic Weisbecker
2010-03-04 13:54 ` Steven Rostedt
2010-03-05  4:14   ` Américo Wang
2010-03-05  7:16     ` Américo Wang
2010-03-05 15:06       ` Steven Rostedt
2010-03-05 17:16         ` Frederic Weisbecker
2010-03-05 18:35           ` Steven Rostedt
2010-03-05 18:43             ` Frederic Weisbecker
2010-03-05 19:07               ` Steven Rostedt
2010-03-08  7:32           ` Li Zefan
2010-03-08 15:15             ` Steven Rostedt
2010-03-09  2:11               ` Li Zefan
2010-03-09  2:23                 ` Steven Rostedt
2010-03-09  3:38                   ` Li Zefan
2010-03-12 19:20                     ` Steven Rostedt
2010-03-12 21:38                       ` Steven Rostedt
2010-03-13  2:07                     ` Steven Rostedt
2010-03-13  5:37                       ` Américo Wang
2010-03-13  5:51                         ` Steven Rostedt
2010-03-14 16:55                         ` Steven Rostedt
2010-03-13 17:56                       ` Steven Rostedt
2010-03-15  1:46                         ` Li Zefan
2010-03-15 14:39                           ` Steven Rostedt
2010-03-15 16:16                             ` Steven Rostedt
2010-03-08  2:31         ` Américo Wang
2010-03-09  2:40           ` Américo Wang
2010-03-09  2:51             ` Steven Rostedt
2010-03-09  5:06               ` Américo Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox