linux-bcache.vger.kernel.org archive mirror
* BUG: soft lockup - CPU#0 stuck for 22s! [bcache_gc:....]
@ 2014-06-16 19:23 Pim van den Berg
  2014-06-16 20:22 ` Vasiliy Tolstov
  2014-09-11  7:35 ` Pavel Goran
  0 siblings, 2 replies; 5+ messages in thread
From: Pim van den Berg @ 2014-06-16 19:23 UTC (permalink / raw)
  To: linux-bcache

Hi,

I'm trying to upgrade from a 3.12.8 kernel to 3.14.6.

Unfortunately my load average doesn't go below 2.00 (as mentioned
earlier on this list). The "bcache: fix uninterruptible sleep in
writeback thread" patch by Slava Pestov doesn't fix that for me.

But more importantly, after a while I run into a soft lockup. I've not been
able to run this kernel for more than a couple of hours.

I'm running 3.14.6, plus these patches from this mailing list:
- bcache: fix uninterruptible sleep in writeback thread
- bcache: fix crash on shutdown in passthrough mode

I've also tried running this 3.14.6 kernel plus all bcache related
patches from 3.15. This makes no difference, same behavior.

[37903.477806] BUG: soft lockup - CPU#0 stuck for 23s! [bcache_gc:1842]
[37903.477838] CPU: 0 PID: 1842 Comm: bcache_gc Not tainted 3.14.6-kvm #2
[37903.477861] Hardware name:                  /DH67CF, BIOS
BLH6710H.86A.0156.2012.0615.1908 06/15/2012
[37903.477899] task: ffff88021ebc6dd0 ti: ffff8800d514a000 task.ti:
ffff8800d514a000
[37903.477935] RIP: 0010:[<ffffffff81464557>]  [<ffffffff81464557>]
bch_btree_iter_next+0x250/0x272
[37903.477978] RSP: 0018:ffff8800d514bbd8  EFLAGS: 00000297
[37903.477999] RAX: 0000000000000000 RBX: ffffffff8146a78c RCX:
0000000009000001
[37903.478023] RDX: ffff88000a32ed60 RSI: ffff880214b874c8 RDI:
ffff8800d514bc28
[37903.478047] RBP: ffff8800d514bbe8 R08: ffff880213440000 R09:
ffff8800d50a8000
[37903.478071] R10: 0000000000000800 R11: 0000000000000008 R12:
ffff880214b874c8
[37903.478095] R13: ffff88000a31c778 R14: ffff88000a30ecc0 R15:
0000000000000000
[37903.478120] FS:  0000000000000000(0000) GS:ffff88021fa00000(0000)
knlGS:0000000000000000
[37903.478156] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[37903.478178] CR2: 00007fb61d572000 CR3: 0000000001c0b000 CR4:
00000000000427e0
[37903.478202] Stack:
[37903.478217]  0000000000003894 ffffffff814648c9 ffff8800d514bc18
ffffffff81464595
[37903.478256]  0000000000003894 ffff880214b874c8 ffff8800d514bc28
ffff8800d514bdb0
[37903.478294]  ffff8800d514bc98 ffffffff81464aeb 0000000000000004
0000000000000002
[37903.478333] Call Trace:
[37903.478352]  [<ffffffff814648c9>] ? bch_ptr_invalid+0xc/0xc
[37903.478374]  [<ffffffff81464595>] bch_btree_iter_next_filter+0x1c/0x3d
[37903.478398]  [<ffffffff81464aeb>] btree_gc_count_keys+0x45/0x57
[37903.478422]  [<ffffffff81468f08>] btree_gc_recurse+0xe3/0x2ba
[37903.478445]  [<ffffffff81464595>] ? bch_btree_iter_next_filter+0x1c/0x3d
[37903.478469]  [<ffffffff8146594e>] ? btree_gc_mark_node+0xc1/0x1c1
[37903.478494]  [<ffffffff810b4db8>] ? __wake_up+0x3f/0x48
[37903.478516]  [<ffffffff810b5056>] ? finish_wait+0x5a/0x60
[37903.478538]  [<ffffffff8146930f>] bch_btree_gc+0x230/0x389
[37903.478561]  [<ffffffff810b4c1c>] ? __wake_up_common+0x80/0x80
[37903.478584]  [<ffffffff8146949a>] bch_gc_thread+0x32/0xe0
[37903.478606]  [<ffffffff81469468>] ? bch_btree_gc+0x389/0x389
[37903.478629]  [<ffffffff810a0ae5>] kthread+0xcd/0xd5
[37903.478650]  [<ffffffff810a0a18>] ? __kthread_parkme+0x5c/0x5c
[37903.478674]  [<ffffffff81603f3c>] ret_from_fork+0x7c/0xb0
[37903.478696]  [<ffffffff810a0a18>] ? __kthread_parkme+0x5c/0x5c
[37903.478717] Code: 4a 01 48 ff c0 48 c1 e0 04 48 c1 e1 04 48 01 d8 48
01 d9 4c 8b 08 4c 89 09 48 8b 40 08 48 89 41 08 48 89 d0 4c 89 07 48 89
77 08 <48> 8b 73 08 48 8d 0c 00 48 8d 51 01 48 39 f2 0f 82 29 ff ff ff
[37931.494997] BUG: soft lockup - CPU#0 stuck for 22s! [bcache_gc:1842]
[37931.495028] CPU: 0 PID: 1842 Comm: bcache_gc Not tainted 3.14.6-kvm #2
[37931.495051] Hardware name:                  /DH67CF, BIOS
BLH6710H.86A.0156.2012.0615.1908 06/15/2012
[37931.495090] task: ffff88021ebc6dd0 ti: ffff8800d514a000 task.ti:
ffff8800d514a000
[37931.495125] RIP: 0010:[<ffffffff8146a856>]  [<ffffffff8146a856>]
bch_extent_bad+0x75/0x15d
[37931.495168] RSP: 0018:ffff8800d514bbc8  EFLAGS: 00000a06
[37931.495189] RAX: 0000000000000001 RBX: ffff880214b874c8 RCX:
0000000009000000
[37931.495214] RDX: 0000000000000001 RSI: 0000000000000001 RDI:
0000000000000001
[37931.495238] RBP: ffff8800d514bbd8 R08: ffff880213440000 R09:
ffff8800d50a8000
[37931.495262] R10: 0000000000000800 R11: 0000000000000008 R12:
ffffffff814686bd
[37931.495286] R13: ffff8800d514bc98 R14: ffff8800cf97b800 R15:
ffff8800d514bdd8
[37931.495311] FS:  0000000000000000(0000) GS:ffff88021fa00000(0000)
knlGS:0000000000000000
[37931.495347] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[37931.495369] CR2: 00007fb61d572000 CR3: 0000000001c0b000 CR4:
00000000000427e0
[37931.495393] Stack:
[37931.495408]  ffff88000a306368 ffffffff814648c9 ffff8800d514bbe8
ffffffff814648d3
[37931.495447]  ffff8800d514bc18 ffffffff814645a6 0000000000000c54
ffff880214b874c8
[37931.495486]  ffff8800d514bc28 ffff8800d514bdb0 ffff8800d514bc98
ffffffff81464aeb
[37931.495524] Call Trace:
[37931.495542]  [<ffffffff814648c9>] ? bch_ptr_invalid+0xc/0xc
[37931.495565]  [<ffffffff814648d3>] bch_ptr_bad+0xa/0xc
[37931.495587]  [<ffffffff814645a6>] bch_btree_iter_next_filter+0x2d/0x3d
[37931.495611]  [<ffffffff81464aeb>] btree_gc_count_keys+0x45/0x57
[37931.495634]  [<ffffffff81468f08>] btree_gc_recurse+0xe3/0x2ba
[37931.495657]  [<ffffffff81464595>] ? bch_btree_iter_next_filter+0x1c/0x3d
[37931.495681]  [<ffffffff8146594e>] ? btree_gc_mark_node+0xc1/0x1c1
[37931.495706]  [<ffffffff810b4db8>] ? __wake_up+0x3f/0x48
[37931.495728]  [<ffffffff810b5056>] ? finish_wait+0x5a/0x60
[37931.495751]  [<ffffffff8146930f>] bch_btree_gc+0x230/0x389
[37931.495773]  [<ffffffff810b4c1c>] ? __wake_up_common+0x80/0x80
[37931.495796]  [<ffffffff8146949a>] bch_gc_thread+0x32/0xe0
[37931.495818]  [<ffffffff81469468>] ? bch_btree_gc+0x389/0x389
[37931.495841]  [<ffffffff810a0ae5>] kthread+0xcd/0xd5
[37931.495863]  [<ffffffff810a0a18>] ? __kthread_parkme+0x5c/0x5c
[37931.495886]  [<ffffffff81603f3c>] ret_from_fork+0x7c/0xb0
[37931.495908]  [<ffffffff810a0a18>] ? __kthread_parkme+0x5c/0x5c
[37931.495930] Code: 00 48 83 f8 07 77 0f 31 ff 49 83 bc c0 40 0c 00 00
00 40 0f 95 c7 85 ff 0f 84 ee 00 00 00 ff c2 89 d0 48 39 f0 72 c5 48 c1
e9 24 <31> c0 80 e1 01 0f 85 d8 00 00 00 49 ba ff ff ff ff ff 07 00 00
[37939.347814] INFO: rcu_sched self-detected stall on CPU { 0}  (t=15001
jiffies g=934378 c=934377 q=13108)
[37939.347864] sending NMI to all CPUs:
[37939.347886] NMI backtrace for cpu 0
[37939.347906] CPU: 0 PID: 1842 Comm: bcache_gc Not tainted 3.14.6-kvm #2
[37939.347929] Hardware name:                  /DH67CF, BIOS
BLH6710H.86A.0156.2012.0615.1908 06/15/2012
[37939.347968] task: ffff88021ebc6dd0 ti: ffff8800d514a000 task.ti:
ffff8800d514a000
[37939.348003] RIP: 0010:[<ffffffff812f281a>]  [<ffffffff812f281a>]
delay_tsc+0x0/0x4b
[37939.348043] RSP: 0018:ffff88021fa03db0  EFLAGS: 00000887
[37939.348065] RAX: 00000000a69f7e00 RBX: 0000000000002710 RCX:
0000000000000007
[37939.348089] RDX: 0000000000274448 RSI: 0000000000000002 RDI:
0000000000274449
[37939.348113] RBP: ffff88021fa03db8 R08: 0000000000000000 R09:
0000000000000000
[37939.348137] R10: ffffffff81863f10 R11: ffff88021e81d400 R12:
ffff88021fa0d330
[37939.348161] R13: 0000000000000000 R14: ffffffff81c25a80 R15:
0000000000000000
[37939.348185] FS:  0000000000000000(0000) GS:ffff88021fa00000(0000)
knlGS:0000000000000000
[37939.348222] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[37939.348244] CR2: 00007fb61d572000 CR3: 0000000001c0b000 CR4:
00000000000427e0
[37939.348267] Stack:
[37939.348283]  ffffffff812f28a6 ffff88021fa03dc8 ffffffff812f28cc
ffff88021fa03de8
[37939.348321]  ffffffff8104f52e ffff88021fa0d848 ffffffff81c25a80
ffff88021fa03e48
[37939.348360]  ffffffff810c33f6 0000000000003334 ffffffff81c76c10
0000000000000000
[37939.348398] Call Trace:
[37939.348415]  <IRQ>
[37939.348419]  [<ffffffff812f28a6>] ? __delay+0xa/0xc
[37939.348454]  [<ffffffff812f28cc>] __const_udelay+0x24/0x26
[37939.348479]  [<ffffffff8104f52e>]
arch_trigger_all_cpu_backtrace+0x65/0x6f
[37939.348505]  [<ffffffff810c33f6>] rcu_check_callbacks+0x1cc/0x4ed
[37939.348530]  [<ffffffff810ac259>] ? account_system_time+0x104/0x14c
[37939.348554]  [<ffffffff81092f1a>] update_process_times+0x3a/0x63
[37939.348578]  [<ffffffff810caf3d>] tick_sched_handle+0x45/0x4a
[37939.349917]  [<ffffffff810cb0e7>] tick_sched_timer+0x37/0x56
[37939.349940]  [<ffffffff810a2d38>] __run_hrtimer.isra.24+0x71/0xca
[37939.349964]  [<ffffffff810a34a0>] hrtimer_interrupt+0xe8/0x1d7
[37939.349987]  [<ffffffff8104e1da>] local_apic_timer_interrupt+0x50/0x54
[37939.350012]  [<ffffffff8104e542>] smp_apic_timer_interrupt+0x3c/0x4f
[37939.350037]  [<ffffffff81604b0a>] apic_timer_interrupt+0x6a/0x70
[37939.350058]  <EOI>
[37939.350063]  [<ffffffff8146a134>] ? bch_debug_exit+0x23/0x23
[37939.350101]  [<ffffffff8146a78c>] ? bch_extent_invalid+0x31/0x86
[37939.350124]  [<ffffffff8146a7fe>] bch_extent_bad+0x1d/0x15d
[37939.350147]  [<ffffffff814648c9>] ? bch_ptr_invalid+0xc/0xc
[37939.350169]  [<ffffffff814648d3>] bch_ptr_bad+0xa/0xc
[37939.350191]  [<ffffffff814645a6>] bch_btree_iter_next_filter+0x2d/0x3d
[37939.350215]  [<ffffffff81464aeb>] btree_gc_count_keys+0x45/0x57
[37939.350238]  [<ffffffff81468f08>] btree_gc_recurse+0xe3/0x2ba
[37939.350261]  [<ffffffff81464595>] ? bch_btree_iter_next_filter+0x1c/0x3d
[37939.350285]  [<ffffffff8146594e>] ? btree_gc_mark_node+0xc1/0x1c1
[37939.350309]  [<ffffffff810b4db8>] ? __wake_up+0x3f/0x48
[37939.350331]  [<ffffffff8146930f>] bch_btree_gc+0x230/0x389
[37939.350354]  [<ffffffff810b4c1c>] ? __wake_up_common+0x80/0x80
[37939.350377]  [<ffffffff8146949a>] bch_gc_thread+0x32/0xe0
[37939.350399]  [<ffffffff81469468>] ? bch_btree_gc+0x389/0x389
[37939.350421]  [<ffffffff810a0ae5>] kthread+0xcd/0xd5
[37939.350442]  [<ffffffff810a0a18>] ? __kthread_parkme+0x5c/0x5c
[37939.350465]  [<ffffffff81603f3c>] ret_from_fork+0x7c/0xb0
[37939.350487]  [<ffffffff810a0a18>] ? __kthread_parkme+0x5c/0x5c
[37939.350508] Code: 90 55 48 89 f8 48 89 e5 48 85 c0 74 19 eb 02 66 90
eb 0e 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 48 ff c8 75 fb 48 ff c8
5d c3 <55> 48 89 e5 65 8b 34 25 1c b0 00 00 0f 1f 00 0f ae e8 0f 31 89
[37939.350623] NMI backtrace for cpu 1
[37939.350625] INFO: NMI handler
(arch_trigger_all_cpu_backtrace_handler) took too long to run: 2.736 msecs
[37939.350685] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.14.6-kvm #2
[37939.350708] Hardware name:                  /DH67CF, BIOS
BLH6710H.86A.0156.2012.0615.1908 06/15/2012
[37939.350746] task: ffff88021e900950 ti: ffff88021e904000 task.ti:
ffff88021e904000
[37939.350782] RIP: 0010:[<ffffffff8131cfde>]  [<ffffffff8131cfde>]
intel_idle+0xbd/0x10b
[37939.350822] RSP: 0018:ffff88021e905e28  EFLAGS: 00000046
[37939.350843] RAX: 0000000000000001 RBX: 0000000000000002 RCX:
0000000000000001
[37939.350867] RDX: 0000000000000000 RSI: ffff88021e905fd8 RDI:
0000000000000001
[37939.350891] RBP: ffff88021e905e58 R08: 0000000000000009 R09:
000000000000030d
[37939.350915] R10: 0000000000000006 R11: 0000000000000400 R12:
0000000000000002
[37939.350939] R13: 0000000000000001 R14: 0000000000000001 R15:
0000000000000000
[37939.350963] FS:  0000000000000000(0000) GS:ffff88021fb00000(0000)
knlGS:0000000000000000
[37939.351000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[37939.351022] CR2: 00007fff5b76bfe8 CR3: 00000000cff86000 CR4:
00000000000427e0
[37939.351045] Stack:
[37939.351060]  ffff88021e905e58 00000001810c42e5 ffff88021fb15d00
ffffffff81c3c188
[37939.351099]  0000227bfbc05f82 ffff88021e905f00 ffff88021e905eb8
ffffffff81498f06
[37939.351137]  0000000000000002 ffffffff81c3c0c0 0000000000000000
00000000001ef3a2
[37939.351176] Call Trace:
[37939.351195]  [<ffffffff81498f06>] cpuidle_enter_state+0x3a/0xac
[37939.351218]  [<ffffffff81499040>] cpuidle_idle_call+0xc8/0x111
[37939.351243]  [<ffffffff810354c8>] arch_cpu_idle+0x9/0x18
[37939.351265]  [<ffffffff810bcdd2>] cpu_startup_entry+0xae/0x118
[37939.351289]  [<ffffffff8104d212>] start_secondary+0x1b2/0x1b7
[37939.351310] Code: 31 d2 65 48 8b 34 25 a0 b7 00 00 48 8d 86 38 e0 ff
ff 48 89 d1 0f 01 c8 48 8b 86 38 e0 ff ff a8 08 75 08 b1 01 4c 89 e8 0f
01 c9 <65> 48 8b 0c 25 a0 b7 00 00 83 a1 3c e0 ff ff fb 0f ae f0 48 8b

-- 
Regards,
Pim

* Re: BUG: soft lockup - CPU#0 stuck for 22s! [bcache_gc:....]
  2014-06-16 19:23 BUG: soft lockup - CPU#0 stuck for 22s! [bcache_gc:....] Pim van den Berg
@ 2014-06-16 20:22 ` Vasiliy Tolstov
  2014-06-17 18:53   ` Peter Kieser
  2014-09-11  7:35 ` Pavel Goran
  1 sibling, 1 reply; 5+ messages in thread
From: Vasiliy Tolstov @ 2014-06-16 20:22 UTC (permalink / raw)
  To: Pim van den Berg; +Cc: linux-bcache

2014-06-16 23:23 GMT+04:00 Pim van den Berg <pim.vandenberg@nethuis.nl>:
> I'm trying to upgrade from a 3.12.8 kernel to 3.14.6.


Why are you moving from 3.12 LTS to 3.14?

-- 
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru
jabber: vase@selfip.ru

* Re: BUG: soft lockup - CPU#0 stuck for 22s! [bcache_gc:....]
  2014-06-16 20:22 ` Vasiliy Tolstov
@ 2014-06-17 18:53   ` Peter Kieser
  2014-06-17 19:46     ` Vasiliy Tolstov
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Kieser @ 2014-06-17 18:53 UTC (permalink / raw)
  To: Vasiliy Tolstov, Pim van den Berg; +Cc: linux-bcache

On 2014-06-16 1:22 PM, Vasiliy Tolstov wrote:
> 2014-06-16 23:23 GMT+04:00 Pim van den Berg <pim.vandenberg@nethuis.nl>:
>> I'm trying to upgrade from a 3.12.8 kernel to 3.14.6.
>
> Why you moving from 3.12 lts to 3.14 ?
>
FYI, Kent Overstreet (maintainer of bcache) recommends 3.14 to fix 
bcache issues.

-Peter


* Re: BUG: soft lockup - CPU#0 stuck for 22s! [bcache_gc:....]
  2014-06-17 18:53   ` Peter Kieser
@ 2014-06-17 19:46     ` Vasiliy Tolstov
  0 siblings, 0 replies; 5+ messages in thread
From: Vasiliy Tolstov @ 2014-06-17 19:46 UTC (permalink / raw)
  To: Peter Kieser; +Cc: Pim van den Berg, linux-bcache

2014-06-17 22:53 GMT+04:00 Peter Kieser <peter@kieser.ca>:
> FYI, Kent Overstreet (maintainer of bcache) recommends 3.14 to fix bcache
> issues.


Hm. Why not backport the regression fixes to all stable LTS kernels
(3.10, 3.12)?

-- 
Vasiliy Tolstov,
e-mail: v.tolstov@selfip.ru
jabber: vase@selfip.ru

* Re: BUG: soft lockup - CPU#0 stuck for 22s! [bcache_gc:....]
  2014-06-16 19:23 BUG: soft lockup - CPU#0 stuck for 22s! [bcache_gc:....] Pim van den Berg
  2014-06-16 20:22 ` Vasiliy Tolstov
@ 2014-09-11  7:35 ` Pavel Goran
  1 sibling, 0 replies; 5+ messages in thread
From: Pavel Goran @ 2014-09-11  7:35 UTC (permalink / raw)
  To: linux-bcache

Hello Pim,

Tuesday, June 17, 2014, 2:23:00 AM, you wrote:

> But more importantly, after a while I run into a soft lockup. I've not been
> able to run this kernel for more than a couple of hours.

> I'm running 3.14.6, plus these patches from this mailing list:
> - bcache: fix uninterruptible sleep in writeback thread
> - bcache: fix crash on shutdown in passthrough mode

> I've also tried running this 3.14.6 kernel plus all bcache related
> patches from 3.15. This makes no difference, same behavior.

> [37903.477806] BUG: soft lockup - CPU#0 stuck for 23s! [bcache_gc:1842]

Seems like I was just hit by this bug.

I'm running pf-kernel (http://pf.natalenko.name) version 3.16-pf1 (based on
vanilla 3.16.1, as far as I remember) with the same two patches (fixing the
uninterruptible sleep and the crash on shutdown). Cache mode is "writearound";
the CPU scheduler is BFS (from the -ck patchset by Con Kolivas, included in
pf-kernel).

The following happened: I was doing some not-so-heavy I/O on a bcached disk (an
update of the portage snapshot by means of emerge-delta-webrsync), and my
terminal application stopped responding. X was still running (I could move
windows, and the load indicators kept updating). However, everything stopped
within a minute or so. I waited for several minutes and then had to shut down
the laptop the hard way.

Kernel messages were logged by syslog this time (I had several hangs before,
but I couldn't see anything in the logs). Here is the summary:

Sep 11 13:19:23 aurora kernel: BUG: soft lockup - CPU#1 stuck for 22s! [bcache_gc:973]
Sep 11 13:19:51 aurora kernel: BUG: soft lockup - CPU#1 stuck for 23s! [bcache_gc:973]
Sep 11 13:19:57 aurora kernel: INFO: rcu_sched self-detected stall on CPU { 1}  (t=18000 jiffies g=24035340 c=24035339 q=40933)
Sep 11 13:20:27 aurora kernel: BUG: soft lockup - CPU#0 stuck for 22s! [bcache_gc:973]
Sep 11 13:20:55 aurora kernel: BUG: soft lockup - CPU#0 stuck for 22s! [bcache_gc:973]
Sep 11 13:21:01 aurora kernel: INFO: rcu_sched self-detected stall on CPU { 0}  (t=18000 jiffies g=24035358 c=24035357 q=17722)
Sep 11 13:21:19 aurora kernel: INFO: task qemu-system-x86:16118 blocked for more than 120 seconds.
Sep 11 13:21:19 aurora kernel: INFO: task qemu-system-x86:16831 blocked for more than 120 seconds.
Sep 11 13:21:19 aurora kernel: INFO: task qemu-system-x86:16832 blocked for more than 120 seconds.
Sep 11 13:21:19 aurora kernel: INFO: task qemu-system-x86:16833 blocked for more than 120 seconds.
Sep 11 13:21:19 aurora kernel: INFO: task qemu-system-x86:16844 blocked for more than 120 seconds.
Sep 11 13:21:19 aurora kernel: INFO: task qemu-system-x86:16848 blocked for more than 120 seconds.
Sep 11 13:21:19 aurora kernel: INFO: task qemu-system-x86:16855 blocked for more than 120 seconds.
Sep 11 13:21:19 aurora kernel: INFO: task qemu-system-x86:16846 blocked for more than 120 seconds.
Sep 11 13:21:19 aurora kernel: INFO: task qemu-system-x86:16847 blocked for more than 120 seconds.
Sep 11 13:21:27 aurora kernel: BUG: soft lockup - CPU#0 stuck for 22s! [bcache_gc:973]
Sep 11 13:21:55 aurora kernel: BUG: soft lockup - CPU#0 stuck for 22s! [bcache_gc:973]
Sep 11 13:22:23 aurora kernel: BUG: soft lockup - CPU#0 stuck for 22s! [bcache_gc:973]
Sep 11 13:22:51 aurora kernel: BUG: soft lockup - CPU#0 stuck for 22s! [bcache_gc:973]
(shutdown)
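
A summary like the one above can be pulled out of a saved syslog with a short
script. The following is only a minimal sketch and was not used to produce the
summary above: the function name count_soft_lockups and the regular expression
are illustrative assumptions, based solely on the "BUG: soft lockup" line
format shown in this thread.

```python
import re
from collections import Counter

# Assumed line format, taken from the messages quoted in this thread:
#   "... kernel: BUG: soft lockup - CPU#1 stuck for 22s! [bcache_gc:973]"
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]+):(?P<pid>\d+)\]"
)


def count_soft_lockups(lines):
    """Count soft lockup reports per (task name, pid, cpu).

    Hypothetical helper: lines that do not match the assumed format
    (for example the rcu_sched stall messages) are simply ignored.
    """
    counts = Counter()
    for line in lines:
        match = SOFT_LOCKUP.search(line)
        if match:
            key = (match["comm"], int(match["pid"]), int(match["cpu"]))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_lockups.py < /path/to/saved/kernel.log
    for (comm, pid, cpu), n in sorted(count_soft_lockups(sys.stdin).items()):
        print(f"{comm}[{pid}] on CPU#{cpu}: {n} soft lockup report(s)")
```

Grouping by task, PID and CPU makes it easy to see whether the same thread
keeps getting reported and whether it moves between CPUs, as bcache_gc does in
the summary above.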

Detailed logs for the first "soft lockup" event and the first "self-detected
stall" event:

Sep 11 13:19:23 aurora kernel: BUG: soft lockup - CPU#1 stuck for 22s! [bcache_gc:973]
Sep 11 13:19:23 aurora kernel: Modules linked in: ctr ccm ipt_REJECT xt_TCPMSS xt_limit xt_multiport xt_conntrack iptable_filter ipt_MASQUERADE xt_nat iptable_nat bnep nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables btusb usbhid uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core b43 bcma mac80211 ssb_hcd dell_laptop dcdbas coretemp kvm_intel joydev pcspkr kvm b44 wl(PO) i2c_i801 microcode cfg80211 mii uhci_hcd libphy sdhci_pci sdhci intel_agp ehci_pci intel_gtt ehci_hcd snd_hda_codec_idt snd_hda_codec_generic ssb mmc_core pcmcia pcmcia_core snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep acpi_cpufreq
Sep 11 13:19:23 aurora kernel: CPU: 1 PID: 973 Comm: bcache_gc Tainted: P           O  3.16.0-pf1 #1
Sep 11 13:19:23 aurora kernel: Hardware name: Dell Inc. Inspiron 1720                   /0UK437, BIOS A09 07/11/2008
Sep 11 13:19:23 aurora kernel: task: ffff8800d96a96c0 ti: ffff880194548000 task.ti: ffff880194548000
Sep 11 13:19:23 aurora kernel: RIP: 0010:[<ffffffff817476af>]  [<ffffffff817476af>] __bch_btree_iter_next+0x1bb/0x214
Sep 11 13:19:23 aurora kernel: RSP: 0018:ffff88019454bb98  EFLAGS: 00000202
Sep 11 13:19:23 aurora kernel: RAX: 0000000000000101 RBX: ffff880195c14400 RCX: 0000000000000000
Sep 11 13:19:23 aurora kernel: RDX: ffff88000c19b2c0 RSI: ffff88000c1a0da0 RDI: ffff88000c1a0d40
Sep 11 13:19:23 aurora kernel: RBP: ffff88019454bbd8 R08: 0000000000000001 R09: 0000000000000400
Sep 11 13:19:23 aurora kernel: R10: 0000000000000400 R11: 0000000000000008 R12: ffff88019454bcc8
Sep 11 13:19:23 aurora kernel: R13: ffff880195c14400 R14: 0000000000000000 R15: ffffffff8174ef5d
Sep 11 13:19:23 aurora kernel: FS:  0000000000000000(0000) GS:ffff88019fd00000(0000) knlGS:0000000000000000
Sep 11 13:19:23 aurora kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Sep 11 13:19:23 aurora kernel: CR2: 00007f260f06cea8 CR3: 000000008edca000 CR4: 00000000000027f0
Sep 11 13:19:23 aurora kernel: Stack:
Sep 11 13:19:23 aurora kernel: ffff88019454bbd8 ffff88000c19b2a8 0000000000000400 000000000000364e
Sep 11 13:19:23 aurora kernel: ffffffff817492b4 ffff8800da1a30d8 ffff88019454bc28 0000000000000001
Sep 11 13:19:23 aurora kernel: ffff88019454bbe8 ffffffff81747718 ffff88019454bc18 ffffffff8174910d
Sep 11 13:19:23 aurora kernel: Call Trace:
Sep 11 13:19:23 aurora kernel: [<ffffffff817492b4>] ? bch_ptr_invalid+0xc/0xc
Sep 11 13:19:23 aurora kernel: [<ffffffff81747718>] bch_btree_iter_next+0x10/0x12
Sep 11 13:19:23 aurora kernel: [<ffffffff8174910d>] bch_btree_iter_next_filter+0x1c/0x3d
Sep 11 13:19:23 aurora kernel: [<ffffffff8174944e>] btree_gc_count_keys+0x45/0x57
Sep 11 13:19:23 aurora kernel: [<ffffffff8174d9be>] btree_gc_recurse+0xf2/0x2ab
Sep 11 13:19:23 aurora kernel: [<ffffffff81749e1b>] ? btree_gc_mark_node+0xb8/0x1a4
Sep 11 13:19:23 aurora kernel: [<ffffffff81071cb2>] ? spin_unlock_irqrestore+0x9/0xb
Sep 11 13:19:23 aurora kernel: [<ffffffff81749c58>] ? __bch_btree_mark_key+0xa9/0x1b4
Sep 11 13:19:23 aurora kernel: [<ffffffff8174ddb3>] bch_btree_gc+0x23c/0x39a
Sep 11 13:19:23 aurora kernel: [<ffffffff81072088>] ? abort_exclusive_wait+0x8a/0x8a
Sep 11 13:19:23 aurora kernel: [<ffffffff8174df43>] bch_gc_thread+0x32/0xe9
Sep 11 13:19:23 aurora kernel: [<ffffffff8174df11>] ? bch_btree_gc+0x39a/0x39a
Sep 11 13:19:23 aurora kernel: [<ffffffff8106593d>] kthread+0xa0/0xa8
Sep 11 13:19:23 aurora kernel: [<ffffffff8106589d>] ? __kthread_parkme+0x5c/0x5c
Sep 11 13:19:23 aurora kernel: [<ffffffff818ce67c>] ret_from_fork+0x7c/0xb0
Sep 11 13:19:23 aurora kernel: [<ffffffff8106589d>] ? __kthread_parkme+0x5c/0x5c
Sep 11 13:19:23 aurora kernel: Code: 4d 8d 6f 01 49 ff c4 49 c1 e4 04 49 c1 e5 04 49 01 dc 49 01 dd 49 8b 14 24 49 8b 4c 24 08 49 8b 7d 00 49 8b 75 08 41 ff d6 84 c0 <75> 44 49 8b 0c 24 49 8b 55 00 49 8b 45 08 49 89 4d 00 49 8b 4c 

And:

Sep 11 13:19:57 aurora kernel: INFO: rcu_sched self-detected stall on CPU { 1}  (t=18000 jiffies g=24035340 c=24035339 q=40933)
Sep 11 13:19:57 aurora kernel: sending NMI to all CPUs:
Sep 11 13:19:57 aurora kernel: NMI backtrace for cpu 1
Sep 11 13:19:57 aurora kernel: CPU: 1 PID: 973 Comm: bcache_gc Tainted: P           O  3.16.0-pf1 #1
Sep 11 13:19:57 aurora kernel: Hardware name: Dell Inc. Inspiron 1720                   /0UK437, BIOS A09 07/11/2008
Sep 11 13:19:57 aurora kernel: task: ffff8800d96a96c0 ti: ffff880194548000 task.ti: ffff880194548000
Sep 11 13:19:57 aurora kernel: RIP: 0010:[<ffffffff8146d957>]  [<ffffffff8146d957>] delay_tsc+0x2a/0x6a
Sep 11 13:19:57 aurora kernel: RSP: 0018:ffff88019fd03d78  EFLAGS: 00000002
Sep 11 13:19:57 aurora kernel: RAX: 0002d8eddf7c6464 RBX: 0000000000260ae7 RCX: 0000000000000046
Sep 11 13:19:57 aurora kernel: RDX: 0002d8eddf7c6464 RSI: 0000000000000c00 RDI: 0000000000260ae7
Sep 11 13:19:57 aurora kernel: RBP: ffff88019fd03d98 R08: 0000000000000400 R09: 0000000000000000
Sep 11 13:19:57 aurora kernel: R10: 0000000000000000 R11: 0000000000000006 R12: 00000000df7c6464
Sep 11 13:19:57 aurora kernel: R13: 0000000000000001 R14: ffffffff820341c0 R15: 0000000000000001
Sep 11 13:19:57 aurora kernel: FS:  0000000000000000(0000) GS:ffff88019fd00000(0000) knlGS:0000000000000000
Sep 11 13:19:57 aurora kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Sep 11 13:19:57 aurora kernel: CR2: 00007f260f06cea8 CR3: 000000008edca000 CR4: 00000000000027f0
Sep 11 13:19:57 aurora kernel: Stack:
Sep 11 13:19:57 aurora kernel: 0000000000002710 ffff88019fd0e350 0000000000000001 ffffffff820341c0
Sep 11 13:19:57 aurora kernel: ffff88019fd03da8 ffffffff8146d8e4 ffff88019fd03db8 ffffffff8146d90a
Sep 11 13:19:57 aurora kernel: ffff88019fd03dd8 ffffffff81036210 ffff88019fd0fa30 ffffffff820341c0
Sep 11 13:19:57 aurora kernel: Call Trace:
Sep 11 13:19:57 aurora kernel: <IRQ> 
Sep 11 13:19:57 aurora kernel: [<ffffffff8146d8e4>] __delay+0xa/0xc
Sep 11 13:19:57 aurora kernel: [<ffffffff8146d90a>] __const_udelay+0x24/0x26
Sep 11 13:19:57 aurora kernel: [<ffffffff81036210>] arch_trigger_all_cpu_backtrace+0xa0/0xb0
Sep 11 13:19:57 aurora kernel: [<ffffffff81096b17>] rcu_check_callbacks+0x1d7/0x509
Sep 11 13:19:57 aurora kernel: [<ffffffff8104f79e>] ? raise_softirq+0x29/0x30
Sep 11 13:19:57 aurora kernel: [<ffffffff8109ebd6>] ? tick_sched_do_timer+0x2a/0x2a
Sep 11 13:19:57 aurora kernel: [<ffffffff81056256>] update_process_times+0x3a/0x63
Sep 11 13:19:57 aurora kernel: [<ffffffff8109e8d9>] tick_sched_handle+0x45/0x4a
Sep 11 13:19:57 aurora kernel: [<ffffffff8109ec0d>] tick_sched_timer+0x37/0x57
Sep 11 13:19:57 aurora kernel: [<ffffffff81067b31>] __run_hrtimer+0xa9/0x14c
Sep 11 13:19:57 aurora kernel: [<ffffffff81068410>] hrtimer_interrupt+0xbc/0x1a5
Sep 11 13:19:57 aurora kernel: [<ffffffff81034d4a>] local_apic_timer_interrupt+0x51/0x56
Sep 11 13:19:57 aurora kernel: [<ffffffff8103520f>] smp_apic_timer_interrupt+0x2d/0x40
Sep 11 13:19:57 aurora kernel: [<ffffffff818cf59d>] apic_timer_interrupt+0x6d/0x80
Sep 11 13:19:57 aurora kernel: <EOI> 
Sep 11 13:19:57 aurora kernel: [<ffffffff8174f40a>] ? ptr_stale+0x16/0x4d
Sep 11 13:19:57 aurora kernel: [<ffffffff8174f880>] ? bch_extent_bad+0xa1/0x12d
Sep 11 13:19:57 aurora kernel: [<ffffffff817492b4>] ? bch_ptr_invalid+0xc/0xc
Sep 11 13:19:57 aurora kernel: [<ffffffff817492be>] bch_ptr_bad+0xa/0xc
Sep 11 13:19:57 aurora kernel: [<ffffffff8174911e>] bch_btree_iter_next_filter+0x2d/0x3d
Sep 11 13:19:57 aurora kernel: [<ffffffff8174944e>] btree_gc_count_keys+0x45/0x57
Sep 11 13:19:57 aurora kernel: [<ffffffff8174d9be>] btree_gc_recurse+0xf2/0x2ab
Sep 11 13:19:57 aurora kernel: [<ffffffff81749e1b>] ? btree_gc_mark_node+0xb8/0x1a4
Sep 11 13:19:57 aurora kernel: [<ffffffff81071cb2>] ? spin_unlock_irqrestore+0x9/0xb
Sep 11 13:19:57 aurora kernel: [<ffffffff81749c58>] ? __bch_btree_mark_key+0xa9/0x1b4
Sep 11 13:19:57 aurora kernel: [<ffffffff8174ddb3>] bch_btree_gc+0x23c/0x39a
Sep 11 13:19:57 aurora kernel: [<ffffffff81072088>] ? abort_exclusive_wait+0x8a/0x8a
Sep 11 13:19:57 aurora kernel: [<ffffffff8174df43>] bch_gc_thread+0x32/0xe9
Sep 11 13:19:57 aurora kernel: [<ffffffff8174df11>] ? bch_btree_gc+0x39a/0x39a
Sep 11 13:19:57 aurora kernel: [<ffffffff8106593d>] kthread+0xa0/0xa8
Sep 11 13:19:57 aurora kernel: [<ffffffff8106589d>] ? __kthread_parkme+0x5c/0x5c
Sep 11 13:19:57 aurora kernel: [<ffffffff818ce67c>] ret_from_fork+0x7c/0xb0
Sep 11 13:19:57 aurora kernel: [<ffffffff8106589d>] ? __kthread_parkme+0x5c/0x5c
Sep 11 13:19:57 aurora kernel: Code: c3 55 48 89 e5 41 56 41 55 41 54 53 89 fb 65 44 8b 2c 25 2c b0 00 00 0f 1f 00 0f ae e8 e8 52 ff ff ff 41 89 c4 0f 1f 00 0f ae e8 <e8> 44 ff ff ff 89 c2 44 29 e2 39 da 73 29 f3 90 65 44 8b 34 25 
Sep 11 13:19:57 aurora kernel: NMI backtrace for cpu 0
Sep 11 13:19:57 aurora kernel: CPU: 0 PID: 9071 Comm: qemu-system-x86 Tainted: P           O  3.16.0-pf1 #1
Sep 11 13:19:57 aurora kernel: Hardware name: Dell Inc. Inspiron 1720                   /0UK437, BIOS A09 07/11/2008
Sep 11 13:19:57 aurora kernel: task: ffff8800066cad80 ti: ffff880101410000 task.ti: ffff880101410000
Sep 11 13:19:57 aurora kernel: RIP: 0033:[<00007fff4fc04d46>]  [<00007fff4fc04d46>] 0x7fff4fc04d46
Sep 11 13:19:57 aurora kernel: RSP: 002b:00007fff4fbffd00  EFLAGS: 00000246
Sep 11 13:19:57 aurora kernel: RAX: 0000000049f1aefd RBX: 0000000000000001 RCX: 000000000000236f
Sep 11 13:19:57 aurora kernel: RDX: 0000000000000002 RSI: 00007fff4fbffd58 RDI: 0000000000000001
Sep 11 13:19:57 aurora kernel: RBP: 00007fff4fbffd30 R08: 00007fba77f40e08 R09: 0000000000000000
Sep 11 13:19:57 aurora kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 000000000efb72b2
Sep 11 13:19:57 aurora kernel: R13: 001dc865bef71050 R14: 00007fba77f3fd34 R15: 0000000000000000
Sep 11 13:19:57 aurora kernel: FS:  00007fba75d8f900(0000) GS:ffff88019fc00000(0000) knlGS:0000000000000000
Sep 11 13:19:57 aurora kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Sep 11 13:19:57 aurora kernel: CR2: 0000000000d10000 CR3: 0000000103ba1000 CR4: 00000000000027f0

Pavel Goran
  
