From: Andrew Morton <akpm@linux-foundation.org>
To: Vegard Nossum <vegard.nossum@gmail.com>
Cc: linux-kernel@vger.kernel.org, linux-usb@vger.kernel.org,
	Jens Axboe <jens.axboe@oracle.com>,
	linux-scsi@vger.kernel.org
Subject: Re: 2.6.29-rc3: BUG: NMI Watchdog detected LOCKUP
Date: Thu, 12 Feb 2009 16:19:08 -0800
Message-ID: <20090212161908.2cc2045c.akpm@linux-foundation.org>
In-Reply-To: <19f34abd0902080221w662635f0h51875a125b156535@mail.gmail.com>
On Sun, 8 Feb 2009 11:21:20 +0100
Vegard Nossum <vegard.nossum@gmail.com> wrote:
> Hi,
>
> Not sure exactly what happened here. Was running LTP, and it seems
> that the USB flash disk (which held the root device, though I was
> running LTP in a chroot on a fixed harddisk) disconnected, although I
> didn't touch it.
>
> [ 3344.890073] usb 1-6: unregistering interface 1-6:1.0
> [ 3344.895744] sd 2:0:0:0: Device offlined - not ready after error recovery
> [ 3344.902893] sd 2:0:0:0: [sdb] Unhandled error code
> [ 3344.908051] sd 2:0:0:0: [sdb] Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK
> [ 3344.916810] end_request: I/O error, dev sdb, sector 1735619
> [ 3344.922746] Write-error on swap-device (8:16:1735627)
> [ 3344.928195] Write-error on swap-device (8:16:1735635)
> [ 3344.933611] Write-error on swap-device (8:16:1735643)
> [ 3344.939020] Write-error on swap-device (8:16:1735651)
> [ 3344.944427] Write-error on swap-device (8:16:1735659)
> [ 3344.949836] Write-error on swap-device (8:16:1735667)
> [ 3344.955320] Write-error on swap-device (8:16:1735675)
> [ 3344.960757] sd 2:0:0:0: rejecting I/O to offline device
> [ 3344.961735] sd 2:0:0:0: rejecting I/O to offline device
Presumably the device layer (USB or SCSI) shat itself. Bad hardware or
a kernel bug?
> [ 3344.972984] BUG: NMI Watchdog detected LOCKUP on CPU1, ip ffffffff81491f02, :
> [ 3344.972984] CPU 1
> [ 3344.972984] Modules linked in:
> [ 3344.972984] Pid: 11127, comm: hackbench Not tainted 2.6.29-rc3 #219
> [ 3344.972984] RIP: 0010:[<ffffffff81491f02>] [<ffffffff81491f02>] _spin_lock_b
> [ 3344.972984] RSP: 0018:ffff880006b01408 EFLAGS: 00000093
> [ 3344.972984] RAX: 0000000000003b39 RBX: 0000000000000001 RCX: 6db6db6db6db6db7
> [ 3344.972984] RDX: ffff88003ec688d8 RSI: ffff880006b01428 RDI: ffff88003ec68b40
> [ 3344.972984] RBP: ffff880006b01408 R08: b000000000000000 R09: 0000000000000000
> [ 3344.972984] R10: ffff880006b01918 R11: 0000000000000000 R12: ffff88003ec688d8
> [ 3344.972984] R13: 0000000000001000 R14: 00000000001aeeb3 R15: ffff88003ec688d8
> [ 3344.972984] FS: 0000000000000000(0000) GS:ffff88003f801a80(0063) knlGS:00000
> [ 3344.972984] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
> [ 3344.972984] CR2: 0000000000b9dea0 CR3: 0000000006ae3000 CR4: 00000000000006a0
> [ 3344.972984] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 3344.972984] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 3344.972984] Process hackbench (pid: 11127, threadinfo ffff880006b00000, task)
> [ 3344.972984] Stack:
> [ 3344.972984] ffff880006b01468 ffffffff8118d26a ffff88001f7e8000 0000000000001
> [ 3344.972984] ffff88001bc33500 0001121000000010 0000000000000047 ffff88001bc30
> [ 3344.972984] ffff88001bc33500 ffff88003ec688d8 00000000001aeeb3 ffff88003ec68
> [ 3344.972984] Call Trace:
> [ 3344.972984] [<ffffffff8118d26a>] __make_request+0x3e/0x412
> [ 3344.972984] [<ffffffff8118bf77>] generic_make_request+0x279/0x2c3
> [ 3344.972984] [<ffffffff8119f189>] ? radix_tree_tag_set+0x6b/0xce
> [ 3344.972984] [<ffffffff8118c087>] submit_bio+0xc6/0xcf
> [ 3344.972984] [<ffffffff8107feb8>] ? unlock_page+0x22/0x26
> [ 3344.972984] [<ffffffff8109ebd4>] swap_writepage+0xa2/0xac
> [ 3344.972984] [<ffffffff8108a076>] shrink_page_list+0x3a7/0x67b
> [ 3344.972984] [<ffffffff810376f1>] ? finish_task_switch+0x68/0x88
> [ 3344.972984] [<ffffffff8101b822>] ? __cpus_empty+0x9/0xb
> [ 3344.972984] [<ffffffff8101ba27>] ? flush_tlb_page+0x66/0x83
> [ 3344.972984] [<ffffffff814908b3>] ? thread_return+0x3d/0xc6
> [ 3344.972984] [<ffffffff8108a98d>] shrink_list+0x29d/0x59f
> [ 3344.972984] [<ffffffff81086c4f>] ? get_dirty_limits+0x22/0x24a
> [ 3344.972984] [<ffffffff8108af10>] shrink_zone+0x281/0x32b
> [ 3344.972984] [<ffffffff8119ff8e>] ? __up_read+0x92/0x9c
> [ 3344.972984] [<ffffffff8108b100>] ? shrink_slab+0x146/0x158
> [ 3344.972984] [<ffffffff8108c022>] try_to_free_pages+0x23d/0x38f
> [ 3344.972984] [<ffffffff81089185>] ? isolate_pages_global+0x0/0x219
> [ 3344.972984] [<ffffffff81085cc9>] __alloc_pages_internal+0x292/0x43d
> [ 3344.972984] [<ffffffff810a6963>] alloc_pages_current+0xb9/0xc2
> [ 3344.972984] [<ffffffff810aa658>] alloc_slab_page+0x19/0x69
> [ 3344.972984] [<ffffffff810aa6f1>] new_slab+0x49/0x1cc
> [ 3344.972984] [<ffffffff8119f8b1>] ? rb_insert_color+0xbd/0xe6
> [ 3344.972984] [<ffffffff810aaad3>] __slab_alloc+0x1f3/0x36c
> [ 3344.972984] [<ffffffff81389fe8>] ? __alloc_skb+0x42/0x130
> [ 3344.972984] [<ffffffff81389fe8>] ? __alloc_skb+0x42/0x130
> [ 3344.972984] [<ffffffff810aaf7c>] kmem_cache_alloc_node+0x69/0xa2
> [ 3344.972984] [<ffffffff81389fe8>] __alloc_skb+0x42/0x130
> [ 3344.972984] [<ffffffff81385bd3>] sock_alloc_send_skb+0xa1/0x200
> [ 3344.972984] [<ffffffff8116700a>] ? security_socket_getpeersec_dgram+0x11/0x3
> [ 3344.972984] [<ffffffff81409250>] unix_stream_sendmsg+0x138/0x2b5
> [ 3344.972984] [<ffffffff8138276b>] __sock_sendmsg+0x59/0x62
> [ 3344.972984] [<ffffffff8138285c>] sock_aio_write+0xe8/0xf8
> [ 3344.972984] [<ffffffff810af9a2>] do_sync_write+0xe7/0x12d
> [ 3344.972984] [<ffffffff8104d980>] ? autoremove_wake_function+0x0/0x38
> [ 3344.972984] [<ffffffff8116d9da>] ? selinux_file_permission+0xbd/0xc6
> [ 3344.972984] [<ffffffff811669d0>] ? security_file_permission+0x11/0x13
> [ 3344.972984] [<ffffffff810b029a>] vfs_write+0xbe/0x105
> [ 3344.972984] [<ffffffff810b03a5>] sys_write+0x47/0x6f
> [ 3344.972984] [<ffffffff8102bba8>] sysenter_dispatch+0x7/0x27
> [ 3344.972984] Code: 01 00 00 f0 66 0f c1 17 38 f2 74 06 f3 90 8a 17 eb f6 c9 c
> [ 3344.972984] BUG: NMI Watchdog detected LOCKUP<4>---[ end trace 820f38a7b2441-
> [ 3344.972984] on CPU0, ip ffffffff81491f6c, registers:
And then the block layer died. Looks like __make_request() was trying
to take the queue lock, probably that of the recently-offlined device.
I'd say that either someone forgot to release the lock on an error
path, or the structure was freed but the kernel still tries to use it.
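
To illustrate the first hypothesis (a lock left held on an error path),
here is a minimal userspace sketch. It is an analogy only, not the block
layer code: struct fake_queue, submit_buggy(), submit_fixed() and
lock_is_free() are all hypothetical names, and a pthread mutex stands in
for the queue spinlock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a request queue; not the kernel's struct. */
struct fake_queue {
	pthread_mutex_t lock;
	bool offline;
	int pending;
};

/* Buggy: the offline-device error path returns with the lock still held. */
int submit_buggy(struct fake_queue *q)
{
	pthread_mutex_lock(&q->lock);
	if (q->offline)
		return -EIO;	/* BUG: missing unlock before return */
	q->pending++;
	pthread_mutex_unlock(&q->lock);
	return 0;
}

/* Fixed: every exit path goes through a single unlock. */
int submit_fixed(struct fake_queue *q)
{
	int ret = 0;

	pthread_mutex_lock(&q->lock);
	if (q->offline) {
		ret = -EIO;
		goto out;
	}
	q->pending++;
out:
	pthread_mutex_unlock(&q->lock);
	return ret;
}

/* Probe, without blocking, whether the lock is currently free. */
bool lock_is_free(struct fake_queue *q)
{
	int rc;

	if (pthread_mutex_trylock(&q->lock) != 0)
		return false;
	rc = pthread_mutex_unlock(&q->lock);
	assert(rc == 0);
	(void)rc;
	return true;
}
```

After submit_buggy() fails on an offline queue, lock_is_free() reports
false. With a real spinlock, the next caller would spin forever instead,
which would be consistent with a watchdog-detected lockup like the one
above, though nothing here confirms that this is what happened.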