From: "Rob Mueller" <robm@fastmail.fm>
To: <linux-kernel@vger.kernel.org>
Cc: "Chris Mason" <mason@suse.com>
Subject: Processes stuck in unkillable D state (now seen in 2.6.7-mm6)
Date: Thu, 8 Jul 2004 15:15:12 -0700
Message-ID: <00f601c46539$0bdf47a0$e6afc742@ROBMHP>

This is an update to a thread I started last week about processes getting 
stuck in D state.

About 2 days ago, we upgraded to 2.6.7-mm6. Things have generally been 
running fine, but today some processes again got stuck in an unkillable D 
state. This time, however, rather than a single process getting stuck, about 
20 got stuck in a relatively short period (about half an hour, it seems). 
All of the processes are cyrus imapd processes.

I've tried to get sysreq-t output, but as this machine is still up and 
running, it has about 2500 processes on it, and I can't seem to get 
consistent sysreq-t output. I set the kernel log buffer size to 17 (128k) 
but that definitely doesn't seem to be enough. I notice that it also seems 
to dump to /var/log/messages, and I get more output there, but it still 
doesn't seem to be a complete process list, and each time I do a sysreq-t, I 
get a different number of procs (though always incomplete) in the output. 
Anyway, I've done sysreq-t twice, and got the output from dmesg -s 1000000 
and /var/log/messages. Since the output is so big, I've put them, and the 
kernel config here:

http://robm.fastmail.fm/kernel/t1/
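
To put the buffer size in perspective, here is a rough back-of-the-envelope sketch. It assumes the "17" above is a power-of-two shift (so 2^17 bytes = 128k) and that each task's sysreq-t entry takes roughly 1 KB; that per-task figure is a guess based on the traces quoted further down, not a measured value, and the function names are made up for illustration.

```python
def log_buf_bytes(shift):
    """Bytes in a log buffer whose size is given as a power-of-two shift."""
    return 1 << shift


def tasks_that_fit(shift, bytes_per_task):
    """How many task entries of the given size fit in the buffer."""
    return log_buf_bytes(shift) // bytes_per_task


def shift_needed(num_tasks, bytes_per_task):
    """Smallest shift whose buffer holds num_tasks entries."""
    total = num_tasks * bytes_per_task
    return (total - 1).bit_length()


BYTES_PER_TASK = 1024  # assumption: rough size of one task's header + trace

print(log_buf_bytes(17))                    # buffer size in bytes
print(tasks_that_fit(17, BYTES_PER_TASK))   # entries that fit at shift 17
print(shift_needed(2500, BYTES_PER_TASK))   # shift needed for ~2500 tasks
```

Under those assumptions, only a small fraction of the roughly 2500 processes can fit in the buffer at once, which would be consistent with the incomplete dumps. The sketch does not check whether the kernel config actually accepts the larger shift it computes.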

Process IDs that are definitely stuck are:

1013, 13389, 13469, 16056, 17340, 18489, 21341, 22661, 23976, 29138, 29752, 
30330, 31106, 31956, 32559, 32575, 3753, 5926, 6052, 8857, 9914

But as mentioned above, you won't find most of these in the sysreq-t output, 
I presume because the buffer isn't big enough. Still, hopefully the ones you 
can see there will provide some useful information. (FYI, searching for 
imapd\s+D in the sysreq-t output, rather than for the individual pids, seems 
to be a quicker way of finding the problem procs.)
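
The search described above can be sketched as a small script that pulls the pids of D-state imapd entries out of a dump and reports which of the known-stuck pids are missing from it. It assumes the pid is the third column after the state letter, which matches the two entries quoted below (3753 and 22661 are both in the stuck list); the function names are made up for illustration.

```python
import re

# Pids listed above as definitely stuck.
KNOWN_STUCK = [
    1013, 13389, 13469, 16056, 17340, 18489, 21341, 22661, 23976, 29138,
    29752, 30330, 31106, 31956, 32559, 32575, 3753, 5926, 6052, 8857, 9914,
]


def stuck_pids(text, comm="imapd"):
    """Return sorted pids of tasks named `comm` shown in D state."""
    # Header layout assumed: comm, state, 8 hex digits, number, pid, parent.
    pattern = re.compile(
        r"\b" + re.escape(comm)
        + r"[ \t]+D[ \t]+[0-9A-Fa-f]{8}[ \t]+\d+[ \t]+(\d+)[ \t]+\d+"
    )
    return sorted({int(m.group(1)) for m in pattern.finditer(text)})


def missing_from_dump(text, known=KNOWN_STUCK):
    """Known-stuck pids that do not appear as D-state entries in the dump."""
    return sorted(set(known) - set(stuck_pids(text)))


SAMPLE = """\
   imapd         D F1778660     0  3753   1906          3754   809 (NOTLB)
   imapd         D E59812C0     0 22661   1906         23248 22592 (NOTLB)
"""

print(stuck_pids(SAMPLE))
print(len(missing_from_dump(SAMPLE)), "known-stuck pids not in this sample")
```

To run it against a real dump, pass the contents of a file such as sysreqmsglog1.txt to stuck_pids() instead of SAMPLE. The pattern is not anchored to the start of the line, so a syslog prefix in front of the entry should not matter.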

Having had a quick look myself, I see some odd things there, though. For 
instance, from sysreqmsglog1.txt:

   imapd         D F1778660     0  3753   1906          3754   809 (NOTLB)
   eb15adb8 00000086 00000020 f1778660 c0310318 c43fc600 08155888 0000002d
          f567d380 f7b97480 c42c3d20 00000000 0001ece6 6051d45f 00007c67 c42c3d20
          c03d8180 f1778660 f1778810 f78ad9cc 00000003 f78ad9cc f78ad9cc c025d40c
   Call Trace:
    [<c0310318>] memcpy_fromiovec+0x38/0x60
    [<c025d40c>] generic_unplug_device+0x2c/0x40
    [<c037a288>] io_schedule+0x28/0x40
    [<c012e17c>] __lock_page+0xbc/0xe0
    [<c012deb0>] page_wake_function+0x0/0x50
    [<c012deb0>] page_wake_function+0x0/0x50
    [<c012f1a1>] filemap_nopage+0x231/0x360
    [<c013dd58>] do_no_page+0xb8/0x3a0
    [<c013bbbb>] pte_alloc_map+0xdb/0xf0
    [<c013e1ee>] handle_mm_fault+0xbe/0x1a0
    [<c0112c62>] do_page_fault+0x172/0x5ec
    [<c012435b>] do_sigaction+0x19b/0x210
    [<c0120dac>] update_process_times+0x2c/0x40
    [<c0110230>] smp_apic_timer_interrupt+0x140/0x150
    [<c0112af0>] do_page_fault+0x0/0x5ec
    [<c0104b19>] error_code+0x2d/0x38

   imapd         D E59812C0     0 22661   1906         23248 22592 (NOTLB)
   d54f5db8 00000086 f7b7de18 e59812c0 d54f5d94 c04b0dc0 00000020 00000000
          c42c3060 f71696f0 c42c3d20 00000000 0002cda6 891b682d 00007b15 c42c3d20
          f71696f0 e59812c0 e5981470 00000003 c025d3bb f78ad9cc f78ad9cc c025d40c
   Call Trace:
    [<c025d3bb>] __generic_unplug_device+0x1b/0x40
    [<c025d40c>] generic_unplug_device+0x2c/0x40
    [<c037a288>] io_schedule+0x28/0x40
    [<c012e17c>] __lock_page+0xbc/0xe0
    [<c012deb0>] page_wake_function+0x0/0x50
    [<c012deb0>] page_wake_function+0x0/0x50
    [<c012f1a1>] filemap_nopage+0x231/0x360
    [<c013dd58>] do_no_page+0xb8/0x3a0
    [<c013bbbb>] pte_alloc_map+0xdb/0xf0
    [<c013e1ee>] handle_mm_fault+0xbe/0x1a0
    [<c0112af0>] do_page_fault+0x0/0x5ec
    [<c0104a5a>] apic_timer_interrupt+0x1a/0x20
    [<c0112c62>] do_page_fault+0x172/0x5ec
    [<c012435b>] do_sigaction+0x19b/0x210
    [<c0124693>] sys_rt_sigaction+0x53/0x90
    [<c030c631>] sys_socketcall+0x111/0x200
    [<c0112af0>] do_page_fault+0x0/0x5ec
    [<c0104b19>] error_code+0x2d/0x38

Those calls into "generic_unplug_device" look really strange to me...

Rob

