From: Lai Jiangshan <laijs@cn.fujitsu.com>
To: tglx@linutronix.de, Ingo Molnar <mingo@elte.hu>,
Andrew Morton <akpm@linux-foundation.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: [BUG] hotplug_cpu vs no_hz
Date: Thu, 03 Jul 2008 14:35:53 +0800
Message-ID: <486C7349.9010707@cn.fujitsu.com>
config:
CONFIG_DETECT_SOFTLOCKUP=y # just for call trace
CONFIG_HOTPLUG_CPU=y
CONFIG_ACPI_HOTPLUG_CPU=y
CONFIG_NO_HZ=y # if =n, these 2 bugs can't occur
These 2 bugs occur on every combination of 2 kernel versions * 2 platforms:
platform: i386, 2 cpus
          x86_64, 2 cores * 2 cpus
kernel version: 2.6.25
                2.6.26-rc8
test 1: (cpu dies)
Offline the other cpus, leaving only cpu#0 and cpu#1 online, then run:
i=0
while ((++i))
do
echo 0 > /sys/devices/system/cpu/cpu1/online
sleep 1
echo 1 > /sys/devices/system/cpu/cpu1/online
sleep 1
echo $i
done
After anywhere from several seconds to several hours,
"echo 1 > /sys/devices/system/cpu/cpu1/online" blocks, cpu#1 can no longer
be used, and dmesg shows:
BUG: soft lockup - CPU#1 stuck for 61s! [events/1:9898]
CPU 1:
Modules linked in:
Pid: 9898, comm: events/1 Not tainted 2.6.26-rc8-official-LAI-00089-ge1441b9 #5
RIP: 0010:[<ffffffff80237612>] [<ffffffff80237612>] __do_softirq+0x4b/0xc7
RSP: 0018:ffff81006b42ff20 EFLAGS: 00000206
RAX: ffff81006a9b9fd8 RBX: ffff81006b42ff40 RCX: 0000000000000006
RDX: 0000000000000042 RSI: ffffffff8022da16 RDI: ffffffff8022da16
RBP: ffff81006b42fea0 R08: ffff81007f2c9178 R09: ffff81007f2c9140
R10: ffff8100807cc000 R11: 0000000000000000 R12: ffffffff8020be36
R13: ffff81006b42fea0 R14: ffffffff807a5100 R15: 0000000000000042
FS: 0000000000000000(0000) GS:ffff81007fb3ccc0(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00007f4cc97b6000 CR3: 0000000000201000 CR4: 00000000000006a0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
<IRQ> [<ffffffff8020c38c>] ? call_softirq+0x1c/0x28
[<ffffffff8020dad6>] ? do_softirq+0x34/0x72
[<ffffffff80237586>] ? irq_exit+0x3f/0x80
[<ffffffff8021b128>] ? smp_apic_timer_interrupt+0x8b/0xa7
[<ffffffff8020be36>] ? apic_timer_interrupt+0x66/0x70
<EOI> [<ffffffff8022da16>] ? finish_task_switch+0x31/0x82
[<ffffffff80590e7d>] ? thread_return+0x3d/0x9c
[<ffffffff80241ca1>] ? worker_thread+0xa3/0xe5
[<ffffffff80244780>] ? autoremove_wake_function+0x0/0x38
[<ffffffff80241bfe>] ? worker_thread+0x0/0xe5
[<ffffffff80244645>] ? kthread+0x49/0x78
[<ffffffff8020c018>] ? child_rip+0xa/0x12
[<ffffffff802445fc>] ? kthread+0x0/0x78
[<ffffffff8020c00e>] ? child_rip+0x0/0x12
INFO: task syslogd:3835 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syslogd D ffffffff805af600 0 3835 1
ffff81007b80fd28 0000000000000082 0000000000000000 ffff81007dd83ec0
ffff81007f01a820 ffff81007fbb1560 ffff81007f01ab78 0000000100000001
00000000ffffffff 0000000000000292 0000000000000000 0000000000000000
Call Trace:
[<ffffffff8030f58f>] log_wait_commit+0xa4/0xf4
[<ffffffff80244780>] ? autoremove_wake_function+0x0/0x38
[<ffffffff8030b1e7>] journal_stop+0x17c/0x1a9
[<ffffffff8030ba16>] journal_force_commit+0x23/0x25
[<ffffffff803049d0>] ext3_force_commit+0x26/0x28
[<ffffffff802fed64>] ext3_write_inode+0x39/0x3f
[<ffffffff802ad9ad>] __writeback_single_inode+0x180/0x29f
[<ffffffff802ae353>] sync_inode+0x24/0x31
[<ffffffff802fb2ff>] ext3_sync_file+0xa3/0xb4
[<ffffffff802b08f7>] do_fsync+0x54/0xaa
[<ffffffff802b097b>] __do_fsync+0x2e/0x44
[<ffffffff802b09ac>] sys_fsync+0xb/0xd
[<ffffffff8020b1fb>] system_call_after_swapgs+0x7b/0x80
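Because the hang in test 1 can take hours to appear, it may help to let the
loop notice the blocked write by itself. The following is only a sketch of
that idea, not part of the original test: write_with_deadline and
hotplug_cycle are hypothetical helper names, and the 10-second deadline is an
arbitrary assumption.

```shell
# Sketch only. write_with_deadline and hotplug_cycle are hypothetical helper
# names; the 10-second deadline is an arbitrary assumption.

# write_with_deadline FILE VALUE SECONDS
# Writes VALUE to FILE from a background subshell. Returns 0 if the write
# completed within SECONDS, 1 if it is still pending (possibly blocked).
write_with_deadline() {
    local file="$1" value="$2" deadline="$3" waited=0 marker
    marker=$(mktemp) || return 2
    rm -f "$marker"
    # The marker file appears only after the write has returned.
    # stdout goes to /dev/null so a stuck writer never holds a caller's pipe.
    ( echo "$value" > "$file"; : > "$marker" ) >/dev/null &
    while [ ! -e "$marker" ]; do
        [ "$waited" -ge "$deadline" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    rm -f "$marker"
    return 0
}

# hotplug_cycle ONLINE_FILE ROUNDS
# Repeats the offline/online cycle from test 1 and stops at the first write
# that does not complete within 10 seconds.
hotplug_cycle() {
    local online_file="$1" rounds="$2" i=0 value
    while [ "$i" -lt "$rounds" ]; do
        i=$((i + 1))
        for value in 0 1; do
            if ! write_with_deadline "$online_file" "$value" 10; then
                echo "round $i: 'echo $value' did not return within 10s"
                return 1
            fi
            sleep 1
        done
    done
    echo "completed $i rounds"
}
```

Run as root, this would be called as
"hotplug_cycle /sys/devices/system/cpu/cpu1/online 100000". A write that is
really stuck stays behind in the background; the sketch only reports it so
that dmesg can be collected right away.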
test 2: (time subsystem is broken)
Offline the other cpus, leaving only cpu#0 and cpu#1 online, then:
try several times: {
echo 0 > /sys/devices/system/cpu/cpu1/online
echo 1 > /sys/devices/system/cpu/cpu1/online
cat /dev/zero > /dev/null &
taskset -p 2 $! # set affinity to cpu#1
top # get cpu usage of "cat /dev/zero"
# if cpu usage = 0%, the bug from test 1 has occurred;
# stop the test
# if cpu usage > 150%, the time subsystem is broken;
# offline/online again and "top" shows even higher cpu usage
# if nothing happens, kill "cat /dev/zero" and try again
}
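The three outcomes listed in the comments of test 2 can be written down as a
small classifier. This is only a sketch: classify_cpu_usage and its three
output labels are hypothetical names, and the thresholds (0% and 150%) are
the ones given in the comments above.

```shell
# Sketch only. classify_cpu_usage and its output labels are hypothetical.

# classify_cpu_usage PERCENT
# PERCENT is the %CPU value that "top" reports for "cat /dev/zero".
classify_cpu_usage() {
    awk -v u="$1" 'BEGIN {
        if (u + 0 == 0)        print "cpu-dead"      # bug of test 1
        else if (u + 0 > 150)  print "time-broken"   # bug of test 2
        else                   print "no-bug"        # kill cat and retry
    }'
}
```

For example, "classify_cpu_usage 99.9" prints "no-bug", and a value such as
180 read from "top" prints "time-broken".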