From: Wu Fengguang <fengguang.wu@intel.com>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>, Peter Zijlstra <peterz@infradead.org>
Subject: Re: [BUG 2.6.27-rc1] find_busiest_group() LOCKUP
Date: Thu, 11 Nov 2010 18:09:21 +0800
Message-ID: <20101111100921.GA25587@localhost>
In-Reply-To: <20101111100628.GA24728@localhost>

On Thu, Nov 11, 2010 at 06:06:28PM +0800, Wu Fengguang wrote:
> Greetings,
>
> I have been hitting this kernel panic since 2.6.37-rc1. 2.6.36 boots OK.
> It's not yet fixed in 2.6.37-rc1-next-20101110. I can conveniently
> test any debug patches.
>
> Thanks,
> Fengguang
> ---
>
> 2.6.37-rc1-next-20101110 boot log

2.6.37-rc1 boot log, almost the same, but stuck in find_next_bit():

[ 0.000000] console [ttyS0] enabled, bootconsole disabled
[ 0.000000] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[ 0.000000] ... MAX_LOCKDEP_SUBCLASSES: 8
[ 0.000000] ... MAX_LOCK_DEPTH: 48
[ 0.000000] ... MAX_LOCKDEP_KEYS: 8191
[ 0.000000] ... CLASSHASH_SIZE: 4096
[ 0.000000] ... MAX_LOCKDEP_ENTRIES: 16384
[ 0.000000] ... MAX_LOCKDEP_CHAINS: 32768
[ 0.000000] ... CHAINHASH_SIZE: 16384
[ 0.000000] memory used by lock dependency info: 6367 kB
[ 0.000000] per task-struct memory footprint: 2688 bytes
[ 0.000000] allocated 62914560 bytes of page_cgroup
[ 0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups
[ 0.000000] Fast TSC calibration using PIT
[ 0.004000] Detected 2666.516 MHz processor.
[ 0.000028] Calibrating delay loop (skipped), value calculated using timer frequency.. 5333.03 BogoMIPS (lpj=10666064)
[ 0.010995] pid_max: default: 32768 minimum: 301
[ 0.018236] Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
[ 0.028644] Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
[ 0.036764] Mount-cache hash table entries: 256
[ 0.042487] Initializing cgroup subsys debug
[ 0.046892] Initializing cgroup subsys ns
[ 0.051030] ns_cgroup deprecated: consider using the 'clone_children' flag without the ns_cgroup.
[ 0.060093] Initializing cgroup subsys cpuacct
[ 0.064674] Initializing cgroup subsys memory
[ 0.069234] Initializing cgroup subsys devices
[ 0.073811] Initializing cgroup subsys freezer
[ 0.078386] Initializing cgroup subsys blkio
[ 0.082905] CPU: Physical Processor ID: 0
[ 0.087044] CPU: Processor Core ID: 0
[ 0.090840] mce: CPU supports 9 MCE banks
[ 0.094988] CPU0: Thermal monitoring enabled (TM1)
[ 0.099921] using mwait in idle threads.
[ 0.103969] Performance Events: PEBS fmt1+, Nehalem events, Intel PMU driver.
[ 0.111449] ... version: 3
[ 0.115583] ... bit width: 48
[ 0.119802] ... generic registers: 4
[ 0.123937] ... value mask: 0000ffffffffffff
[ 0.129373] ... max period: 000000007fffffff
[ 0.134816] ... fixed-purpose events: 3
[ 0.138957] ... event mask: 000000070000000f
[ 0.145671] ACPI: Core revision 20101013
[ 0.171011] ftrace: allocating 29456 entries in 116 pages
[ 0.185896] Setting APIC routing to flat
[ 0.190577] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 0.236384] CPU0: Genuine Intel(R) CPU 000 @ 2.67GHz stepping 04
[ 0.349319] lockdep: fixing up alternatives.
[ 0.353960] Booting Node 0, Processors #1lockdep: fixing up alternatives.
[ 0.472080] #2lockdep: fixing up alternatives.
[ 0.588082] #3lockdep: fixing up alternatives.
[ 0.704042] #4lockdep: fixing up alternatives.
[ 0.820145] Ok.
[ 0.822112] Booting Node 1, Processors #5lockdep: fixing up alternatives.
[ 0.940140] Ok.
[ 0.942107] Booting Node 0, Processors #6lockdep: fixing up alternatives.
[ 1.060128] Ok.
[ 1.062100] Booting Node 1, Processors #7 Ok.
[ 1.176824] Brought up 8 CPUs
[ 1.179908] Total of 8 processors activated (42666.32 BogoMIPS).
[ 1.186105] Testing NMI watchdog ... OK.
[ 6.770490] BUG: NMI Watchdog detected LOCKUP on CPU0, ip ffffffff815854e7, registers:
[ 6.778665] CPU 0
[ 6.780556] Modules linked in:
[ 6.784094]
[ 6.785702] Pid: 1, comm: swapper Not tainted 2.6.37-rc1 #10 X8DTN/X8DTN
[ 6.792523] RIP: 0010:[<ffffffff815854e7>] [<ffffffff815854e7>] find_next_bit+0x117/0x160
[ 6.801043] RSP: 0018:ffff8801b9687870 EFLAGS: 00000006
[ 6.806475] RAX: 0000000000000008 RBX: ffff8800bac0e410 RCX: 0000000000000000
[ 6.813724] RDX: 0000000000000008 RSI: 0000000000000008 RDI: ffff8800bac0e410
[ 6.820977] RBP: ffff8801b9687870 R08: 0000000000000000 R09: 00000000001d2c80
[ 6.828232] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800ba40de48
[ 6.835485] R13: ffff8801b9687b0c R14: 0000000000000000 R15: 00000000001d2c80
[ 6.842740] FS: 0000000000000000(0000) GS:ffff8800ba400000(0000) knlGS:0000000000000000
[ 6.851015] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 6.856873] CR2: 0000000000000000 CR3: 0000000002041000 CR4: 00000000000006f0
[ 6.864121] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 6.871375] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 6.878630] Process swapper (pid: 1, threadinfo ffff8801b9686000, task ffff8800b3398000)
[ 6.886904] Stack:
[ 6.889029] ffff8801b9687890 ffffffff81584d99 0000000000000007 00000000001d2c80
[ 6.896861] ffff8801b9687a40 ffffffff810a9fca 0000000000000001 ffff8801b96879e0
[ 6.904696] ffff8801b96879b0 ffff8801bfdd2c80 0000000000000007 0000000000000000
[ 6.912530] Call Trace:
[ 6.915094] [<ffffffff81584d99>] cpumask_next_and+0x39/0x80
[ 6.920873] [<ffffffff810a9fca>] find_busiest_group+0x24a/0x1200
[ 6.927087] [<ffffffff810b08cf>] load_balance+0xdf/0xa60
[ 6.932608] [<ffffffff81c49b13>] ? schedule+0xdb3/0xee0
[ 6.938040] [<ffffffff81c49c29>] schedule+0xec9/0xee0
[ 6.943293] [<ffffffff81c4a69c>] schedule_timeout+0x30c/0x450
[ 6.949246] [<ffffffff81106a6b>] ? trace_hardirqs_off+0x1b/0x30
[ 6.955367] [<ffffffff810f606d>] ? local_clock+0x9d/0xb0
[ 6.960888] [<ffffffff81c4f0bc>] ? _raw_spin_unlock_irq+0x4c/0x70
[ 6.967188] [<ffffffff81c49fc5>] wait_for_common+0x185/0x220
[ 6.973055] [<ffffffff810b2250>] ? default_wake_function+0x0/0x30
[ 6.979349] [<ffffffff81c4a1b4>] wait_for_completion+0x24/0x30
[ 6.985388] [<ffffffff810eba42>] kthread_create+0xc2/0x160
[ 6.991075] [<ffffffff810e3c40>] ? rescuer_thread+0x0/0x2a0
[ 6.996856] [<ffffffff810a17cf>] ? complete+0x2f/0x80
[ 7.002115] [<ffffffff8110a35b>] ? trace_hardirqs_on+0x1b/0x30
[ 7.008152] [<ffffffff81212b50>] ? kmem_cache_alloc_notrace+0x160/0x1c0
[ 7.014971] [<ffffffff810e37d5>] __alloc_workqueue_key+0x465/0x8d0
[ 7.021358] [<ffffffff823c0a21>] cpuset_init_smp+0x5d/0x82
[ 7.027052] [<ffffffff8239f3fe>] kernel_init+0x1e7/0x337
[ 7.032572] [<ffffffff810529e4>] kernel_thread_helper+0x4/0x10
[ 7.038614] [<ffffffff81c4f690>] ? restore_args+0x0/0x30
[ 7.044133] [<ffffffff8239f217>] ? kernel_init+0x0/0x337
[ 7.049652] [<ffffffff810529e0>] ? kernel_thread_helper+0x0/0x10
[ 7.055857] Code: d2 75 ce 48 83 c7 08 48 83 e8 40 49 83 c0 40 48 ff 05 be 59 a5 01 e9 2a ff ff ff 66 0f 1f 84 00 00 00 00 00 48 ff 05 99 59 a5 01 <c9> c3 0f 1f 80 00 00 00 00 49 8d 04 00 48 ff 05 bd 59 a5 01 c9
[ 7.078960] ---[ end trace 4eaa2a86a8e2da22 ]---
[ 7.083696] Kernel panic - not syncing: Non maskable interrupt
[ 7.089643] Pid: 1, comm: swapper Tainted: G D 2.6.37-rc1 #10
[ 7.096283] Call Trace:
[ 7.098850] <NMI> [<ffffffff81c4889e>] panic+0xad/0x260
[ 7.104435] [<ffffffff81c4f17d>] ? _raw_spin_unlock_irqrestore+0x9d/0xb0
[ 7.111338] [<ffffffff81c50e32>] die_nmi+0x182/0x1a0
[ 7.116511] [<ffffffff81c51a4a>] nmi_watchdog_tick+0x1ea/0x290
[ 7.122542] [<ffffffff81c502c0>] do_nmi+0x230/0x450
[ 7.127620] [<ffffffff81c4fbc0>] nmi+0x20/0x39
[ 7.132267] [<ffffffff815854e7>] ? find_next_bit+0x117/0x160
[ 7.138124] <<EOE>> [<ffffffff81584d99>] cpumask_next_and+0x39/0x80
[ 7.144747] [<ffffffff810a9fca>] find_busiest_group+0x24a/0x1200
[ 7.150956] [<ffffffff810b08cf>] load_balance+0xdf/0xa60
[ 7.156474] [<ffffffff81c49b13>] ? schedule+0xdb3/0xee0
[ 7.161899] [<ffffffff81c49c29>] schedule+0xec9/0xee0
[ 7.167151] [<ffffffff81c4a69c>] schedule_timeout+0x30c/0x450
[ 7.173099] [<ffffffff81106a6b>] ? trace_hardirqs_off+0x1b/0x30
[ 7.179224] [<ffffffff810f606d>] ? local_clock+0x9d/0xb0
[ 7.184737] [<ffffffff81c4f0bc>] ? _raw_spin_unlock_irq+0x4c/0x70
[ 7.191032] [<ffffffff81c49fc5>] wait_for_common+0x185/0x220
[ 7.196898] [<ffffffff810b2250>] ? default_wake_function+0x0/0x30
[ 7.203200] [<ffffffff81c4a1b4>] wait_for_completion+0x24/0x30
[ 7.209239] [<ffffffff810eba42>] kthread_create+0xc2/0x160
[ 7.214926] [<ffffffff810e3c40>] ? rescuer_thread+0x0/0x2a0
[ 7.220707] [<ffffffff810a17cf>] ? complete+0x2f/0x80
[ 7.225966] [<ffffffff8110a35b>] ? trace_hardirqs_on+0x1b/0x30
[ 7.231999] [<ffffffff81212b50>] ? kmem_cache_alloc_notrace+0x160/0x1c0
[ 7.238824] [<ffffffff810e37d5>] __alloc_workqueue_key+0x465/0x8d0
[ 7.245209] [<ffffffff823c0a21>] cpuset_init_smp+0x5d/0x82
[ 7.250902] [<ffffffff8239f3fe>] kernel_init+0x1e7/0x337
[ 7.256422] [<ffffffff810529e4>] kernel_thread_helper+0x4/0x10
[ 7.262455] [<ffffffff81c4f690>] ? restore_args+0x0/0x30
[ 7.267974] [<ffffffff8239f217>] ? kernel_init+0x0/0x337
[ 7.273486] [<ffffffff810529e0>] ? kernel_thread_helper+0x0/0x10
[ 8.307196] Rebooting in 10 seconds..
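
When reading the traces above, note that frames prefixed with "?" are addresses the kernel found on the stack but could not confirm as part of the live call chain, so the confirmed path is cpumask_next_and() <- find_busiest_group() <- load_balance() <- schedule() and so on. The following is only a minimal sketch, not a tool from this thread, for filtering those confirmed frames out of a saved log. The names FRAME_RE and reliable_frames are hypothetical, and the regular expression assumes the "[<addr>] symbol+0xoff/0xsize" layout shown in this log.

```python
import re

# Matches frames such as "[<ffffffff81584d99>] cpumask_next_and+0x39/0x80".
# The optional "? " before the symbol marks an unconfirmed stack entry.
# The layout is an assumption based on the log in this message.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<guess>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def reliable_frames(log_text):
    """Return (symbol, offset, size) for every frame not marked with '?'."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m and not m.group("guess"):
            frames.append(
                (m.group("sym"), int(m.group("off"), 16), int(m.group("size"), 16))
            )
    return frames


# A few lines copied from the first call trace in this message.
sample = """\
[ 6.915094] [<ffffffff81584d99>] cpumask_next_and+0x39/0x80
[ 6.920873] [<ffffffff810a9fca>] find_busiest_group+0x24a/0x1200
[ 6.927087] [<ffffffff810b08cf>] load_balance+0xdf/0xa60
[ 6.932608] [<ffffffff81c49b13>] ? schedule+0xdb3/0xee0
[ 6.938040] [<ffffffff81c49c29>] schedule+0xec9/0xee0
"""

for sym, off, size in reliable_frames(sample):
    print(f"{sym}+{off:#x}/{size:#x}")
```

The sketch only drops the "?" entries. It does not resolve addresses to source lines, which would need the matching vmlinux and a symbolizer.
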

Thread overview: 39+ messages
2010-11-11 10:06 [BUG 2.6.27-rc1] find_busiest_group() LOCKUP Wu Fengguang
2010-11-11 10:09 ` Wu Fengguang [this message]
2010-11-11 12:36 ` Peter Zijlstra
2010-11-11 12:40 ` Wu Fengguang
2010-11-11 13:04 ` Peter Zijlstra
2010-11-13 8:40 ` Wu Fengguang
2010-11-13 10:30 ` Peter Zijlstra
2010-11-13 12:00 ` Wu Fengguang
2010-11-13 12:57 ` Peter Zijlstra
2010-11-13 13:10 ` Wu Fengguang
2010-11-13 19:12 ` Yinghai Lu
2010-11-13 19:41 ` Peter Zijlstra
2010-11-13 23:57 ` Wu Fengguang
2010-11-14 0:18 ` Yinghai Lu
2010-11-14 0:55 ` Wu Fengguang
2010-11-14 1:38 ` [PATCH] x86, acpi: Handle all SRAT cpu entries even have cpu num limitation Yinghai Lu
2010-11-14 17:32 ` Wu Fengguang
2010-11-14 18:02 ` Yinghai Lu
2010-11-14 18:19 ` Yinghai Lu
2010-11-15 1:22 ` Wu Fengguang
2010-12-15 22:01 ` H. Peter Anvin
2010-12-15 22:40 ` Yinghai Lu
2010-12-15 22:53 ` H. Peter Anvin
2010-12-15 22:57 ` Yinghai Lu
2010-12-17 3:09 ` [PATCH 1/2] x86, acpi: add MAX_LOCAL_APIC for 32bit Yinghai Lu
2010-12-17 3:09 ` [PATCH -v2 2/2] x86, acpi: Parse all SRAT cpu entries even have cpu num limitation Yinghai Lu
2010-12-17 18:53 ` Venkatesh Pallipadi
2010-12-17 19:27 ` Yinghai Lu
2010-12-17 19:35 ` Kay Sievers
2010-12-17 23:32 ` Venkatesh Pallipadi
2010-12-21 4:31 ` Venkatesh Pallipadi
2010-12-22 6:43 ` David Rientjes
2010-12-22 20:28 ` Venkatesh Pallipadi
2010-12-22 22:51 ` David Rientjes
2010-12-27 18:43 ` H. Peter Anvin
2010-12-23 23:22 ` [tip:x86/apic] x86, acpi: Parse all SRAT cpu entries even above the cpu number limitation tip-bot for Yinghai Lu
2010-12-17 20:56 ` [PATCH 1/2] x86, acpi: add MAX_LOCAL_APIC for 32bit David Rientjes
2010-12-23 23:21 ` [tip:x86/apic] x86, acpi: Add " tip-bot for Yinghai Lu
2010-11-11 12:42 ` [BUG 2.6.27-rc1] find_busiest_group() LOCKUP Wu Fengguang