From: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
To: Srivatsa Vaddagiri
<vatsa-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>,
Balbir Singh <balbir-xthvdsQ13ZrQT0dZR+AlfA@public.gmane.org>,
Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org>,
Peter Zijlstra
<a.p.zijlstra-/NLkJaSkS4VmR6Xm/wNWPw@public.gmane.org>
Cc: iranna.ankad-xthvdsQ13ZrQT0dZR+AlfA@public.gmane.org,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
bugme-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r@public.gmane.org
Subject: Re: [Bugme-new] [Bug 14150] New: BUG: soft lockup - CPU#3 stuck for 61s!, while running cpu controller latency testcase on two containers parallaly
Date: Fri, 11 Sep 2009 14:28:13 -0700
Message-ID: <20090911142813.dc823bf1.akpm@linux-foundation.org>
In-Reply-To: <bug-14150-10286-V0hAGp6uBxO456/isadD/XN4h3HLQggn@public.gmane.org/>
(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).
On Thu, 10 Sep 2009 09:32:30 GMT
bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r@public.gmane.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=14150
>
> Summary: BUG: soft lockup - CPU#3 stuck for 61s!, while running
> cpu controller latency testcase on two containers
> parallaly
> Product: Process Management
> Version: 2.5
> Kernel Version: 2.6.31-rc7
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: high
> Priority: P1
> Component: Scheduler
> AssignedTo: mingo-X9Un+BFzKDI@public.gmane.org
> ReportedBy: risrajak-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org
> CC: serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org, iranna.ankad-xthvdsQ13ZrQT0dZR+AlfA@public.gmane.org,
> risrajak-xthvdsQ13ZrQT0dZR+AlfA@public.gmane.org
> Regression: No
>
>
> Created an attachment (id=23055)
> --> (http://bugzilla.kernel.org/attachment.cgi?id=23055)
> Config-file-used
>
> We hit this soft lockup on multiple 32-bit SystemX machines running the
> 2.6.31-rc7 kernel.
>
> Scenario:
> - Running the cpu controller latency testcase from LTP in two containers at
> the same time.
>
> Steps:
> 1. Compile ltp-full-20090731.tgz on host.
> 2. Create two containers (the lxc tool,
> http://sourceforge.net/projects/lxc/lxc-0.6.3.tar.gz, was used to create them),
> e.g.:
> lxc-create -n foo1
> lxc-create -n foo2
> In the first shell:
> lxc-execute -n foo1 -f /usr/etc/lxc/lxc-macvlan.conf /bin/bash
> In the second shell:
> lxc-execute -n foo2 -f /usr/etc/lxc/lxc-macvlan.conf /bin/bash
>
> 3. Run either the cpu_latency testcase alone or "./runltp -f controllers"
> at the same time in both containers.
> 4. After the testcase completes, the messages below appear in dmesg.
>
> Expected Result:
> - No soft lockup should occur.
>
> Actual Result:
> - The soft lockup reproduces in 3 out of 5 tries.
>
> hrtimer: interrupt too slow, forcing clock min delta to 5843235 ns
> hrtimer: interrupt too slow, forcing clock min delta to 5842476 ns
> Clocksource tsc unstable (delta = 18749057581 ns)
> BUG: soft lockup - CPU#3 stuck for 61s! [cpuctl_latency_:17174]
> Modules linked in: bridge stp llc bnep sco l2cap bluetooth sunrpc ipv6
> p4_clockmod dm_multipath uinput qla2xxx ata_generic pata_acpi usb_storage e1000
> scsi_transport_fc joydev scsi_tgt i2c_piix4 pata_serverworks pcspkr serio_raw
> mptspi mptscsih mptbase scsi_transport_spi radeon ttm drm i2c_algo_bit i2c_core
> [last unloaded: scsi_wait_scan]
>
> Pid: 17174, comm: cpuctl_latency_ Tainted: G W (2.6.31-rc7 #1) IBM
> eServer BladeCenter HS40 -[883961X]-
> EIP: 0060:[<c058aded>] EFLAGS: 00000283 CPU: 3
> EIP is at find_next_bit+0x9/0x79
> EAX: c2c437a0 EBX: f3d433c0 ECX: 00000000 EDX: 00000020
> ESI: c2c436bc EDI: 00000000 EBP: f063be6c ESP: f063be64
> DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> CR0: 80050033 CR2: 008765a4 CR3: 314d7000 CR4: 000006d0
> DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> DR6: ffff0ff0 DR7: 00000400
> Call Trace:
> [<c0427b6e>] cpumask_next+0x17/0x19
> [<c042c28d>] tg_shares_up+0x53/0x149
> [<c0424082>] ? tg_nop+0x0/0xc
> [<c0424082>] ? tg_nop+0x0/0xc
> [<c042406e>] walk_tg_tree+0x63/0x77
> [<c042c23a>] ? tg_shares_up+0x0/0x149
> [<c042e836>] update_shares+0x5d/0x65
> [<c0432af3>] rebalance_domains+0x114/0x460
> [<c0403393>] ? restore_all_notrace+0x0/0x18
> [<c0432e75>] run_rebalance_domains+0x36/0xa3
> [<c043c324>] __do_softirq+0xbc/0x173
> [<c043c416>] do_softirq+0x3b/0x5f
> [<c043c52d>] irq_exit+0x3a/0x68
> [<c0417846>] smp_apic_timer_interrupt+0x6d/0x7b
> [<c0403c9b>] apic_timer_interrupt+0x2f/0x34
> BUG: soft lockup - CPU#2 stuck for 61s! [watchdog/2:11]
> Modules linked in: bridge stp llc bnep sco l2cap bluetooth sunrpc ipv6
> p4_clockmod dm_multipath uinput qla2xxx ata_generic pata_acpi usb_storage e1000
> scsi_transport_fc joydev scsi_tgt i2c_piix4 pata_serverworks pcspkr serio_raw
> mptspi mptscsih mptbase scsi_transport_spi radeon ttm drm i2c_algo_bit i2c_core
> [last unloaded: scsi_wait_scan]
>
> Pid: 11, comm: watchdog/2 Tainted: G W (2.6.31-rc7 #1) IBM eServer
> BladeCenter HS40 -[883961X]-
> EIP: 0060:[<c042c313>] EFLAGS: 00000246 CPU: 2
> EIP is at tg_shares_up+0xd9/0x149
> EAX: 00000000 EBX: f09b3c00 ECX: f0baac00 EDX: 00000100
> ESI: 00000002 EDI: 00000400 EBP: f6cb7de0 ESP: f6cb7db8
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> CR0: 8005003b CR2: 08070680 CR3: 009c8000 CR4: 000006d0
> DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> DR6: ffff0ff0 DR7: 00000400
> Call Trace:
> [<c0424082>] ? tg_nop+0x0/0xc
> [<c0424082>] ? tg_nop+0x0/0xc
> [<c042406e>] walk_tg_tree+0x63/0x77
> [<c042c23a>] ? tg_shares_up+0x0/0x149
> [<c042e836>] update_shares+0x5d/0x65
> [<c0432af3>] rebalance_domains+0x114/0x460
> [<c0432e75>] run_rebalance_domains+0x36/0xa3
> [<c043c324>] __do_softirq+0xbc/0x173
> [<c043c416>] do_softirq+0x3b/0x5f
> [<c043c52d>] irq_exit+0x3a/0x68
> [<c0417846>] smp_apic_timer_interrupt+0x6d/0x7b
> [<c0403c9b>] apic_timer_interrupt+0x2f/0x34
> [<c0430d37>] ? finish_task_switch+0x5d/0xc4
> [<c0744b11>] schedule+0x74c/0x7b2
> [<c0590e28>] ? trace_hardirqs_on_thunk+0xc/0x10
> [<c0403393>] ? restore_all_notrace+0x0/0x18
> [<c0471e19>] ? watchdog+0x0/0x79
> [<c0471e19>] ? watchdog+0x0/0x79
> [<c0471e63>] watchdog+0x4a/0x79
> [<c0449a53>] kthread+0x70/0x75
> [<c04499e3>] ? kthread+0x0/0x75
> [<c0403e93>] kernel_thread_helper+0x7/0x10
> [root@hs40 ltp-full-20090731]# uname -a
> Linux hs40.in.ibm.com 2.6.31-rc7 #1 SMP Thu Sep 3 10:14:41 IST 2009 i686 i686
> i386 GNU/Linux
> [root@hs40 ltp-full-20090731]#
>
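When triaging a dmesg capture like the one quoted above, it can help to pull out just the soft-lockup reports. The following is a minimal sketch, assuming the line format shown in this report ("BUG: soft lockup - CPU#N stuck for Ns! [comm:pid]"). The function and field names are made up for illustration and are not part of any kernel tooling.

```python
import re

# Matches lines shaped like the ones in the quoted dmesg output, e.g.
#   BUG: soft lockup - CPU#3 stuck for 61s! [cpuctl_latency_:17174]
# A leading "> " quote marker is harmless because the pattern is unanchored.
SOFT_LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]*):(?P<pid>\d+)\]"
)


def parse_soft_lockups(log_text):
    """Return one dict per soft-lockup report found in log_text."""
    reports = []
    for match in SOFT_LOCKUP_RE.finditer(log_text):
        reports.append({
            "cpu": int(match.group("cpu")),
            "seconds": int(match.group("secs")),
            "comm": match.group("comm"),
            "pid": int(match.group("pid")),
        })
    return reports


if __name__ == "__main__":
    # The two report lines from the log above.
    sample = (
        "BUG: soft lockup - CPU#3 stuck for 61s! [cpuctl_latency_:17174]\n"
        "BUG: soft lockup - CPU#2 stuck for 61s! [watchdog/2:11]\n"
    )
    for report in parse_soft_lockups(sample):
        print(report)
```

Run against the log in this report, it would list one entry for CPU#3 (cpuctl_latency_) and one for CPU#2 (watchdog/2), which makes it easier to see that both CPUs were stuck while walking the task-group tree.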