From: "Rob Mueller" <robm@fastmail.fm>
To: "Chris Mason" <mason@suse.com>
Cc: <linux-kernel@vger.kernel.org>
Subject: Re: Processes stuck in unkillable D state (now seen in 2.6.7-mm6)
Date: Mon, 12 Jul 2004 12:53:44 -0700
Message-ID: <009e01c46849$f2e85430$9aafc742@ROBMHP>
In-Reply-To: 1089377936.3956.148.camel@watt.suse.com


> Things will be much easier for you if you configure a serial or network
> console.

> It's just crud on the stack, you're really waiting in io_schedule() for
> a page to get unlocked.  Why isn't the page unlocking?  Hard to say for
> sure without seeing the whole sysrq-t.  If the network/serial console
> doesn't work out, I can help you configure lkcd as well.

Well, I tried compiling in the network console, but it seems to be way too
buggy. Basically the machine would crash (hard lockup) within about 12-24
hours of booting, with nothing on the network console itself or in any log
file. Not much help there.

Anyway, after rebooting back into a kernel without netconsole, we did get
another stuck process. This time there was only one, and I was able to shut
down all the other processes, so only about 50 procs were running when I
did the sysrq-t. I should have been able to capture all the output this
time? I've put the dumps here:

http://robm.fastmail.fm/kernel/t2/

Here's the relevant stuck proc.

imapd         D E17BE6E0     0  3761      1               10291 (NOTLB)
e11c3bc8 00000086 00000020 e17be6e0 c1372d20 00000246 00000220 f7e12380
       00000020 c0136667 c42c6da0 00000001 00000d74 bbfe8a6a 0000040d c42c6da0
       f7f91140 e17be6e0 e17be890 f78cd9cc 00000003 f78cd9cc f78cd9cc c025d2cc
Call Trace:
 [<c0136667>] kmem_cache_alloc+0x57/0x70
 [<c025d2cc>] generic_unplug_device+0x2c/0x40
 [<c037a148>] io_schedule+0x28/0x40
 [<c012e03c>] __lock_page+0xbc/0xe0
 [<c012dd70>] page_wake_function+0x0/0x50
 [<c012dd70>] page_wake_function+0x0/0x50
 [<c012f061>] filemap_nopage+0x231/0x360
 [<c013dc18>] do_no_page+0xb8/0x3a0
 [<c013ba7b>] pte_alloc_map+0xdb/0xf0
 [<c013e0ae>] handle_mm_fault+0xbe/0x1a0
 [<c025d292>] __generic_unplug_device+0x32/0x40
 [<c0112af2>] do_page_fault+0x172/0x5ec
 [<c014cab0>] bh_wake_function+0x0/0x40
 [<c014cab0>] bh_wake_function+0x0/0x40
 [<c018ec9f>] reiserfs_prepare_file_region_for_write+0x94f/0x9b0
 [<c0112980>] do_page_fault+0x0/0x5ec
 [<c0104b19>] error_code+0x2d/0x38
 [<c018dc0f>] reiserfs_copy_from_user_to_file_region+0x8f/0x100
 [<c018f2b1>] reiserfs_file_write+0x5b1/0x750
 [<c0186675>] reiserfs_link+0xb5/0x190
 [<c0186719>] reiserfs_link+0x159/0x190
 [<c016134c>] dput+0x1c/0x1b0
 [<c016134c>] dput+0x1c/0x1b0
 [<c01581a0>] path_release+0x10/0x40
 [<c015a9bc>] sys_link+0xcc/0xe0
 [<c014bb9a>] vfs_write+0xaa/0xe0
 [<c014b610>] default_llseek+0x0/0x110
 [<c014bc4f>] sys_write+0x2f/0x50
 [<c010406b>] syscall_call+0x7/0xb

Is that in lock_page again?
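
For checking this sort of thing across a whole sysrq-t dump, here is a
minimal Python sketch. It assumes every frame line has the
" [<addr>] symbol+0xoff/0xsize" layout shown in the trace above; the helper
names (parse_frames, mentions) are made up for illustration. Note that a
symbol appearing in the trace is only a hint, since some entries can be
stale values left on the stack.

```python
import re

# Matches frame lines such as " [<c012e03c>] __lock_page+0xbc/0xe0".
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_frames(trace_text):
    """Return (address, symbol, offset, size) tuples in trace order."""
    frames = []
    for line in trace_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            addr, sym, off, size = m.groups()
            frames.append((int(addr, 16), sym, int(off, 16), int(size, 16)))
    return frames


def mentions(trace_text, symbol):
    """True if the symbol shows up anywhere in the trace (stale or not)."""
    return any(sym == symbol for _, sym, _, _ in parse_frames(trace_text))


# First few frames of the imapd trace above, used as sample input.
SAMPLE = """\
 [<c0136667>] kmem_cache_alloc+0x57/0x70
 [<c025d2cc>] generic_unplug_device+0x2c/0x40
 [<c037a148>] io_schedule+0x28/0x40
 [<c012e03c>] __lock_page+0xbc/0xe0
"""

if __name__ == "__main__":
    for addr, sym, off, size in parse_frames(SAMPLE):
        print(f"{addr:08x} {sym} +{off:#x}/{size:#x}")
    print("mentions __lock_page:", mentions(SAMPLE, "__lock_page"))
```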

Hopefully there's some helpful information there. If the dump there isn't
complete, can you give me an idea why it might not be? I've set the kernel
log buffer shift to 17 (128k), and the proc list was definitely small enough
to fit in the buffer. When I did "dmesg -s 1000000 > foo", the first part of
the file was still the original boot sequence. Any other suggestions on what
to do?
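
As a sanity check on those numbers, here is a small Python sketch. It
assumes the "17" is a power-of-two shift, so the buffer holds 1 << 17 bytes,
and that the file "foo" is the dmesg capture mentioned above. The function
names are made up for illustration.

```python
import os


def log_buffer_bytes(shift):
    """Size in bytes of a log buffer configured as a power-of-two shift."""
    return 1 << shift


def fits_in_buffer(dump_bytes, shift):
    """True if a capture of dump_bytes could fit in the buffer unwrapped."""
    return dump_bytes <= log_buffer_bytes(shift)


if __name__ == "__main__":
    shift = 17
    print("buffer size:", log_buffer_bytes(shift), "bytes")
    # "foo" is the capture file name from the message; skip if absent.
    if os.path.exists("foo"):
        size = os.path.getsize("foo")
        print("capture size:", size, "fits:", fits_in_buffer(size, shift))
```

A shift of 17 gives 131072 bytes, which matches the 128k figure. A capture
that is smaller than that and still begins with the boot messages suggests
that the buffer had not wrapped.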

Rob


Thread overview:
2004-07-08 22:15 Processes stuck in unkillable D state (now seen in 2.6.7-mm6) Rob Mueller
2004-07-09 12:58 ` Chris Mason
2004-07-12 19:53   ` Rob Mueller [this message]
2004-07-12 20:11     ` William Lee Irwin III
2004-07-12 20:14       ` Rob Mueller
2004-07-12 20:25         ` William Lee Irwin III
2004-07-20 19:51     ` Chris Mason
2004-07-20 21:19       ` Rob Mueller
2004-07-15 16:12   ` Processes stuck in unkillable D state (now seen in 2.6.8-rc1) Rob Mueller
  -- strict thread matches above, loose matches on Subject: below --
2004-07-09  9:52 Processes stuck in unkillable D state (now seen in 2.6.7-mm6) Rob Mueller
