* 2.6.1 IO lockup on SMP systems
From: Sergey S. Kostyliov @ 2004-01-31 16:40 UTC
To: linux-kernel; +Cc: anton
Hello all,
I have experienced lockups on three of my servers with 2.6.1. It doesn't
look like a deadlock: the box is still pingable, and all TCP ports which were
in the listen state before a lockup remain in the listen state, but I can't get
any data from these ports. According to sar(1), the systems were not overloaded
right before a lockup. And there are no log entries in any user service logs
for almost 10 hours after the lockup.
So I think this is an IO lockup. On the other hand, it doesn't look like a bug
in a particular controller driver, because the controllers are different on each
box. And finally, it doesn't look like a bug in a particular IO scheduler,
because two of the boxes were running "deadline" and one "as". Of course, all
these assumptions are valid only if all the lockups I have seen have the same
cause.
All three boxes are SMP. Unfortunately, all are remote and aren't attached
to a serial console yet (this is planned for the next couple of weeks).
1) ope
01:02.1 RAID bus controller: Mylex Corporation: Unknown device 0050 (rev 02)
elevator=deadline
.config: http://sysadminday.org.ru/2.6.1-io_lockup/ope/.config
lspci: http://sysadminday.org.ru/2.6.1-io_lockup/ope/lspci
lspci -vvn: http://sysadminday.org.ru/2.6.1-io_lockup/ope/lspci_-vvn
2) white
02:04.0 RAID bus controller: American Megatrends Inc. MegaRAID (rev 02)
elevator=deadline
.config: http://sysadminday.org.ru/2.6.1-io_lockup/white/.config
lspci: http://sysadminday.org.ru/2.6.1-io_lockup/white/lspci
lspci -vvn: http://sysadminday.org.ru/2.6.1-io_lockup/white/lspci_-vvn
3) tiny
02:00.0 Unknown mass storage controller: Compaq Computer Corporation Smart-2/P RAID Controller (rev 03)
03:00.0 Unknown mass storage controller: Compaq Computer Corporation Smart-2/P RAID Controller (rev 03)
elevator=as
.config: http://sysadminday.org.ru/2.6.1-io_lockup/tiny/.config
lspci: http://sysadminday.org.ru/2.6.1-io_lockup/tiny/lspci
lspci -vvn: http://sysadminday.org.ru/2.6.1-io_lockup/tiny/lspci_-vvn
Any hints will be appreciated.
--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc
* Re: 2.6.1 IO lockup on SMP systems
From: Andrew Morton @ 2004-02-01 0:17 UTC
To: Sergey S. Kostyliov; +Cc: linux-kernel, anton
"Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
>
> I have experienced lockups on three of my servers with 2.6.1. It doesn't
> look like a deadlock: the box is still pingable, and all TCP ports which were
> in the listen state before a lockup remain in the listen state, but I can't get
> any data from these ports. According to sar(1), the systems were not overloaded
> right before a lockup. And there are no log entries in any user service logs
> for almost 10 hours after the lockup.
Please ensure that CONFIG_KALLSYMS is enabled, then generate an all-tasks
backtrace on a locked machine with sysrq-T or `echo t >
/proc/sysrq-trigger'. Then send us the resulting trace.
You may need a serial console to be able to capture all the output.
Also, it would be useful to know what sort of load the machines are under,
and what filesystems are in use.
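The `echo t > /proc/sysrq-trigger' step above could also be driven from a small script, for example one started by a watchdog. The following is only a sketch under stated assumptions: it needs root and a kernel with magic SysRq support enabled, and the helper name trigger_sysrq and its path parameter are made up for illustration. The backtrace itself still goes to the kernel log and console, not to the script.

```python
def trigger_sysrq(key, path="/proc/sysrq-trigger"):
    """Write one SysRq command character to the trigger file.

    't' asks for the all-tasks backtrace and 'm' for memory info.
    `path` is a parameter only so the helper can be tried against
    an ordinary file; on a real box leave it at the default.
    """
    if len(key) != 1:
        raise ValueError("SysRq key must be a single character")
    with open(path, "w") as trigger:
        trigger.write(key)
```

Calling trigger_sysrq("t") as root should have the same effect as the echo command; capturing the resulting output is still easiest over a serial console.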
Thanks.
* Re: 2.6.1 IO lockup on SMP systems
From: Sergey S. Kostyliov @ 2004-02-21 16:45 UTC
To: Andrew Morton; +Cc: linux-kernel, anton
Hello Andrew,
On Sunday 01 February 2004 03:17, Andrew Morton wrote:
> "Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
> >
> > I have experienced lockups on three of my servers with 2.6.1. It doesn't
> > look like a deadlock: the box is still pingable, and all TCP ports which were
> > in the listen state before a lockup remain in the listen state, but I can't get
> > any data from these ports. According to sar(1), the systems were not overloaded
> > right before a lockup. And there are no log entries in any user service logs
> > for almost 10 hours after the lockup.
>
> Please ensure that CONFIG_KALLSYMS is enabled, then generate an all-tasks
> backtrace on a locked machine with sysrq-T or `echo t >
> /proc/sysrq-trigger'. Then send us the resulting trace.
I've just reproduced this lockup with 2.6.3.
>
> You may need a serial console to be able to capture all the output.
>
> Also, it would be useful to know what sort of load the machines are under,
> and what filesystems are in use.
The machine is an HTTP server. The main applications are:
1) apache 1.3 which serves php pages (mod_php):
15.3 requests/sec - 111.9 kB/second - 7.3 kB/request
54 requests currently being processed, 19 idle servers
2) mysql:
Threads: 19 Questions: 26922012 Slow queries: 9799 Opens: 64980
Flush tables: 1 Open tables: 630 Queries per second avg: 143.547
This is an IO-bound machine in general. All filesystems are reiserfs.
Here is the sysrq-T output obtained from a locked box via serial console:
SysRq : Show State
free sibling
task PC stack pid father child younger older
init D 28E916FC 24 1 0 2 (NOTLB)
c244fcf0 00000086 d8460080 28e916fc 00003243 c2422bc0 f77fbd00 00000096
d8460080 2ede4081 00003243 c02af980 00000001 2ede4181 00003243 d8460080
d84600a0 c2422bc0 000017a2 2ede43e1 00003243 c244dac8 03471525 c244fd04
Call Trace:
[<c012b62c>] schedule_timeout+0x6c/0xc0
[<c0142b11>] wakeup_bdflush+0x21/0x40
[<c012b5b0>] process_timeout+0x0/0x10
[<c011eb7b>] io_schedule_timeout+0x2b/0x40
[<c01f54a4>] blk_congestion_wait+0x84/0xa0
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c0148f9f>] try_to_free_pages+0xef/0x190
[<c014187c>] __alloc_pages+0x1dc/0x350
[<c013e6a9>] filemap_nopage+0x329/0x3d0
[<c0157728>] read_swap_cache_async+0xb8/0xd0
[<c014c903>] swapin_readahead+0x43/0x90
[<c014cb98>] do_swap_page+0x248/0x320
[<c014d4d0>] handle_mm_fault+0xe0/0x1b0
[<c011a94c>] do_page_fault+0x33c/0x530
[<c0171334>] sys_select+0x264/0x520
[<c011a610>] do_page_fault+0x0/0x530
[<c0109d25>] error_code+0x2d/0x38
migration/0 S 00000001 12 2 1 3 (L-TLB)
c245dfc4 00000046 c241abc0 00000001 00000003 c245df98 c02ab560 f7da6d00
69378bf7 247a42e0 00000000 e11a4de2 f77e1f58 c245c000 f77e1f50 00000292
c245dfc4 c241abc0 00001801 e11a54a6 00000001 c244ce68 c241b4ec c245c000
Call Trace:
[<c011f4af>] migration_thread+0xdf/0x160
[<c011f3d0>] migration_thread+0x0/0x160
[<c0106f79>] kernel_thread_helper+0x5/0xc
ksoftirqd/0 S 00000001 24 3 1 4 2 (L-TLB)
c245bfd8 00000046 c241abc0 00000001 00000003 f3fc8ca8 f63f62e0 c241c54c
c245bf94 c245bf94 c241c55c 00000000 cebbd940 253f991d 000019b1 f5f766d0
f5f766f0 c241abc0 0000010b 8d075105 000019ea c244c838 c245a000 c245a000
Call Trace:
[<c0126d22>] ksoftirqd+0xe2/0x100
[<c0126c40>] ksoftirqd+0x0/0x100
[<c0106f79>] kernel_thread_helper+0x5/0xc
migration/1 S 00000001 8 4 1 5 3 (L-TLB)
c2459fc4 00000046 c2422bc0 00000001 00000003 c02aeedc c02ab560 c0336c60
c012bb70 c02aeedc c02aeed8 c2458000 c0123735 00000082 c02aba30 00000008
c2458000 c2422bc0 00004b63 0295e0d2 00000000 c244c208 c24234ec c2458000
Call Trace:
[<c012bb70>] free_uid+0x20/0x90
[<c0123735>] reparent_to_init+0x105/0x1a0
[<c011f4af>] migration_thread+0xdf/0x160
[<c011f3d0>] migration_thread+0x0/0x160
[<c0106f79>] kernel_thread_helper+0x5/0xc
ksoftirqd/1 S C0134355 24 5 1 6 4 (L-TLB)
c2455fd8 00000046 c03385e0 c0134355 02002bfd cad564c8 ed2ed0e0 c242454c
c2455f94 c2455f94 c242455c 00000000 c2454000 c033759c c0126a03 eca2f350
eca2f370 c2422bc0 0000025a 31392d56 000019e4 c2457ae8 c2454000 c2454000
Call Trace:
[<c0134355>] rcu_process_callbacks+0x155/0x190
[<c0126a03>] tasklet_action+0x73/0xe0
[<c0126d22>] ksoftirqd+0xe2/0x100
[<c0126c40>] ksoftirqd+0x0/0x100
[<c0106f79>] kernel_thread_helper+0x5/0xc
events/0 S 00000001 0 6 1 14588 7 5 (L-TLB)
f7f93f70 00000046 c241abc0 00000001 00000003 0000000b f77fb8c0 c02b8124
00000246 c241b520 c0353e40 00000000 f7fcbbe4 f7f92000 f7fcbbe0 00000092
f7f93f70 c241abc0 000001c9 25a4fd1b 00003243 c24574b8 f7f92000 f7fcbbcc
Call Trace:
[<c01333e5>] worker_thread+0x285/0x2b0
[<c01e5a60>] console_callback+0x0/0xe0
[<c011d8e0>] default_wake_function+0x0/0x20
[<c0109192>] ret_from_fork+0x6/0x14
[<c011d8e0>] default_wake_function+0x0/0x20
[<c0133160>] worker_thread+0x0/0x2b0
[<c0106f79>] kernel_thread_helper+0x5/0xc
events/1 S ED2ED520 0 7 1 8 6 (L-TLB)
f7f91f70 00000046 00000000 ed2ed520 00000000 e07e9e88 ed2ed520 00000000
00000000 f630a080 0000007b 0000007b f630a080 f630a0a0 c2422bc0 f6258d20
f6258d40 c2422bc0 0000006b ecdabbfb 000019ef c2456e88 f7f90000 f7fcbc2c
Call Trace:
[<c01333e5>] worker_thread+0x285/0x2b0
[<c0132e00>] __call_usermodehelper+0x0/0x70
[<c011d8e0>] default_wake_function+0x0/0x20
[<c0109192>] ret_from_fork+0x6/0x14
[<c011d8e0>] default_wake_function+0x0/0x20
[<c0133160>] worker_thread+0x0/0x2b0
[<c0106f79>] kernel_thread_helper+0x5/0xc
kblockd/0 S 00000001 24 8 1 9 7 (L-TLB)
c2527f70 00000046 c241abc0 00000001 00000003 00000001 f776f2a0 f7fa8000
c02027ec c2772e00 f3dbde28 f7c37834 f7fcb3a4 c2526000 f7fcb3a0 00000092
c2527f70 c241abc0 0000067d 03fc798c 00002b4f c2456858 c2526000 f7fcb38c
Call Trace:
[<c02027ec>] DAC960_process_queue+0x1c/0x170
[<c01333e5>] worker_thread+0x285/0x2b0
[<c01f4670>] blk_unplug_work+0x0/0x20
[<c011d8e0>] default_wake_function+0x0/0x20
[<c0109192>] ret_from_fork+0x6/0x14
[<c011d8e0>] default_wake_function+0x0/0x20
[<c0133160>] worker_thread+0x0/0x2b0
[<c0106f79>] kernel_thread_helper+0x5/0xc
kblockd/1 S C2772E00 8 9 1 13 8 (L-TLB)
c2525f70 00000046 c01f29d6 c2772e00 00000003 00000003 d1a49ae0 f7fa8000
c02027ec c2772e00 ecffc5d8 c2763c60 f7fcb404 c2524000 f7fcb400 c25026b0
c25026d0 c2422bc0 00000961 ff665670 00002b4e c2456228 c2524000 f7fcb3ec
Call Trace:
[<c01f29d6>] elv_next_request+0x16/0x110
[<c02027ec>] DAC960_process_queue+0x1c/0x170
[<c01333e5>] worker_thread+0x285/0x2b0
[<c01f4670>] blk_unplug_work+0x0/0x20
[<c011d8e0>] default_wake_function+0x0/0x20
[<c0109192>] ret_from_fork+0x6/0x14
[<c011d8e0>] default_wake_function+0x0/0x20
[<c0133160>] worker_thread+0x0/0x2b0
[<c0106f79>] kernel_thread_helper+0x5/0xc
kswapd0 S C23FFE98 0 13 1 10 9 (L-TLB)
f7dfbf04 00000046 c23fff38 c23ffe98 000000d0 00000200 f77a06c0 c02b0280
00000002 00000000 c0149200 00000100 c02b0280 000000d0 00000200 f72ecce0
f72ecd00 c2422bc0 0000b6a2 df9ac558 00003243 c2502878 f7dfa000 f7dfbf20
Call Trace:
[<c0149200>] balance_pgdat+0x1c0/0x250
[<c014939b>] kswapd+0x10b/0x160
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c0109192>] ret_from_fork+0x6/0x14
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c0149290>] kswapd+0x0/0x160
[<c0106f79>] kernel_thread_helper+0x5/0xc
kirqd S 00000001 8 10 1 14 13 (L-TLB)
c2501fa0 00000046 c2422bc0 00000001 00000003 00000000 d1a49040 00000000
c0109c28 00000000 000000d5 005d2025 c244d2d0 4926873b 03471a9a f77ac6f0
f77ac710 c2422bc0 000006fd 881c4eb1 00003243 c2503b08 03472e23 c2501fb4
Call Trace:
[<c0109c28>] common_interrupt+0x18/0x20
[<c012b62c>] schedule_timeout+0x6c/0xc0
[<c012b5b0>] process_timeout+0x0/0x10
[<c0118ee7>] balanced_irq+0x57/0x80
[<c0118e90>] balanced_irq+0x0/0x80
[<c0106f79>] kernel_thread_helper+0x5/0xc
aio/0 S 00000082 0 14 1 15 10 (L-TLB)
f7da9f70 00000046 00000001 00000082 00000001 c244ff68 c02ab560 f7da9f4c
c011d93a c244d900 00000003 00000000 c244ff68 f7da8000 00010000 c244d900
c244d920 c241abc0 000027fb 1965fa0c 00000000 c2502248 f7da8000 00000000
Call Trace:
[<c011d93a>] __wake_up_common+0x3a/0x70
[<c01333e5>] worker_thread+0x285/0x2b0
[<c011d8e0>] default_wake_function+0x0/0x20
[<c0109192>] ret_from_fork+0x6/0x14
[<c011d8e0>] default_wake_function+0x0/0x20
[<c0133160>] worker_thread+0x0/0x2b0
[<c0106f79>] kernel_thread_helper+0x5/0xc
aio/1 S 00000001 0 15 1 16 14 (L-TLB)
f7da5f70 00000046 c2422bc0 00000001 00000003 c244ff68 c02ab560 f7da5f4c
c011d93a c244d900 00000003 00000000 c244ff68 f7da4000 00010000 f7da7960
f7dd7c04 c2422bc0 0000241f 19668d09 00000000 f7da7b28 f7da4000 00000000
Call Trace:
[<c011d93a>] __wake_up_common+0x3a/0x70
[<c01333e5>] worker_thread+0x285/0x2b0
[<c011d8e0>] default_wake_function+0x0/0x20
[<c0109192>] ret_from_fork+0x6/0x14
[<c011d8e0>] default_wake_function+0x0/0x20
[<c0133160>] worker_thread+0x0/0x2b0
[<c0106f79>] kernel_thread_helper+0x5/0xc
kseriod S 00000001 1688 16 1 17 15 (L-TLB)
c26c9fb0 00000046 c2422bc0 00000001 00000003 e2024870 f68ad760 00000002
e2024000 012aeedc e20248b0 c02be660 c02be8a0 c02be820 00000286 c021ab1a
c02be8a0 c2422bc0 018bda82 3f31580d 0000317e f7da6268 c26c8000 ffffe000
Call Trace:
[<c021ab1a>] serio_find_dev+0x6a/0x70
[<c021adb6>] serio_thread+0x146/0x180
[<c0109192>] ret_from_fork+0x6/0x14
[<c011d8e0>] default_wake_function+0x0/0x20
[<c021ac70>] serio_thread+0x0/0x180
[<c0106f79>] kernel_thread_helper+0x5/0xc
reiserfs/0 S 00000003 0 17 1 18 16 (L-TLB)
c2697f70 00000046 f880b38c 00000003 00000001 00000000 ecec66e0 f880b398
f880b34c 00000292 c01b824f f8831c20 c26dce44 c2696000 c26dce40 cdc9aca0
cdc9acc0 c241abc0 00001bcf c3d3faeb 00001a46 f7da6898 c2696000 c26dce2c
Call Trace:
[<c01b824f>] kupdate_one_transaction+0x12f/0x250
[<c01333e5>] worker_thread+0x285/0x2b0
[<c01b97c0>] reiserfs_journal_commit_task_func+0x0/0x100
[<c011d8e0>] default_wake_function+0x0/0x20
[<c0109192>] ret_from_fork+0x6/0x14
[<c011d8e0>] default_wake_function+0x0/0x20
[<c0133160>] worker_thread+0x0/0x2b0
[<c0106f79>] kernel_thread_helper+0x5/0xc
reiserfs/1 S 00000003 0 18 1 23 17 (L-TLB)
f77e1f70 00000046 f8a272ec 00000003 00000001 00000000 f776fb20 00000000
f8a272ac f77a11f0 c01b824f f77a11f0 c26dcea4 f77e0000 c26dcea0 f38aace0
f38aad00 c2422bc0 00000448 e234458a 00001a55 f7da6ec8 f77e0000 c26dce8c
Call Trace:
[<c01b824f>] kupdate_one_transaction+0x12f/0x250
[<c01333e5>] worker_thread+0x285/0x2b0
[<c01b97c0>] reiserfs_journal_commit_task_func+0x0/0x100
[<c011d8e0>] default_wake_function+0x0/0x20
[<c0109192>] ret_from_fork+0x6/0x14
[<c011d8e0>] default_wake_function+0x0/0x20
[<c0133160>] worker_thread+0x0/0x2b0
[<c0106f79>] kernel_thread_helper+0x5/0xc
devfsd D 25935C12 16 23 1 610 18 (NOTLB)
f7683bcc 00000086 f5f6b980 25935c12 00003243 c241abc0 f77fb8c0 00000096
f5f6b980 25935ac8 00003243 c02af980 00000001 25935c12 00003243 f5f6b980
f5f6b9a0 c241abc0 00002372 25935f22 00003243 f7757538 03471489 f7683be0
Call Trace:
[<c012b62c>] schedule_timeout+0x6c/0xc0
[<c0142b11>] wakeup_bdflush+0x21/0x40
[<c012b5b0>] process_timeout+0x0/0x10
[<c011eb7b>] io_schedule_timeout+0x2b/0x40
[<c01f54a4>] blk_congestion_wait+0x84/0xa0
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c0148f9f>] try_to_free_pages+0xef/0x190
[<c014187c>] __alloc_pages+0x1dc/0x350
[<c0157728>] read_swap_cache_async+0xb8/0xd0
[<c014c903>] swapin_readahead+0x43/0x90
[<c014cb98>] do_swap_page+0x248/0x320
[<c014d4d0>] handle_mm_fault+0xe0/0x1b0
[<c011a94c>] do_page_fault+0x33c/0x530
[<c0141740>] __alloc_pages+0xa0/0x350
[<c011b858>] recalc_task_prio+0xa8/0x1d0
[<c011d503>] schedule+0x373/0x700
[<c011a610>] do_page_fault+0x0/0x530
[<c0109d25>] error_code+0x2d/0x38
[<c01cc1f6>] __copy_to_user_ll+0x46/0x80
[<c01c161e>] devfsd_read+0x42e/0x4e0
[<c011d8e0>] default_wake_function+0x0/0x20
[<c011d8e0>] default_wake_function+0x0/0x20
[<c015c4a8>] vfs_read+0xb8/0x130
[<c015c752>] sys_read+0x42/0x70
[<c01092bb>] syscall_call+0x7/0xb
syslogd D 00000001 0 610 1 616 23 (NOTLB)
f71cdcf0 00000086 c241abc0 00000001 00000003 c2422bc0 f77a04a0 00000096
f5f6b980 24068d22 00003243 c02af980 00000001 00000096 f71cc000 f71cc000
f71cdd04 c241abc0 00001cf0 2406959c 00003243 f7da74f8 0347146f f71cdd04
Call Trace:
[<c012b62c>] schedule_timeout+0x6c/0xc0
[<c0142b11>] wakeup_bdflush+0x21/0x40
[<c012b5b0>] process_timeout+0x0/0x10
[<c011eb7b>] io_schedule_timeout+0x2b/0x40
[<c01f54a4>] blk_congestion_wait+0x84/0xa0
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c0148f9f>] try_to_free_pages+0xef/0x190
[<c014187c>] __alloc_pages+0x1dc/0x350
[<c013e6a9>] filemap_nopage+0x329/0x3d0
[<c0157728>] read_swap_cache_async+0xb8/0xd0
[<c014c903>] swapin_readahead+0x43/0x90
[<c014cb98>] do_swap_page+0x248/0x320
[<c014d4d0>] handle_mm_fault+0xe0/0x1b0
[<c011a94c>] do_page_fault+0x33c/0x530
[<c0125b1e>] do_setitimer+0x1be/0x1f0
[<c01086c0>] sys_sigreturn+0xf0/0x110
[<c011a610>] do_page_fault+0x0/0x530
[<c0109d25>] error_code+0x2d/0x38
klogd D B043E3E1 0 616 1 699 610 (NOTLB)
f7769cf0 00000086 d8460080 b6390e66 00003244 c2422bc0 f7306d60 00000096
d8460080 b6390d6f 00003244 c02af980 00000001 b6390e66 00003244 d8460080
d84600a0 c2422bc0 00001736 bc2e3c36 00003244 f7628838 03472f32 f7769d04
Call Trace:
[<c012b62c>] schedule_timeout+0x6c/0xc0
[<c0142b11>] wakeup_bdflush+0x21/0x40
[<c012b5b0>] process_timeout+0x0/0x10
[<c011eb7b>] io_schedule_timeout+0x2b/0x40
[<c01f54a4>] blk_congestion_wait+0x84/0xa0
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c0148f9f>] try_to_free_pages+0xef/0x190
[<c014187c>] __alloc_pages+0x1dc/0x350
[<c0157728>] read_swap_cache_async+0xb8/0xd0
[<c014c903>] swapin_readahead+0x43/0x90
[<c014cb98>] do_swap_page+0x248/0x320
[<c014d4d0>] handle_mm_fault+0xe0/0x1b0
[<c011a94c>] do_page_fault+0x33c/0x530
[<c011d8e0>] default_wake_function+0x0/0x20
[<c012b4b0>] do_timer+0xc0/0xd0
[<c015c4c2>] vfs_read+0xd2/0x130
[<c015c752>] sys_read+0x42/0x70
[<c011a610>] do_page_fault+0x0/0x530
[<c0109d25>] error_code+0x2d/0x38
ntpd S 00000001 16 699 1 718 616 (NOTLB)
f735deb0 00000086 c2422bc0 00000001 00000003 00000000 f77a06c0 c02afe00
00000000 00000000 08098478 00000000 f72ecce0 00000010 c02b0700 00000000
000000d0 c2422bc0 00000260 ce28bcca 00003244 f72ecea8 00000000 7fffffff
Call Trace:
[<c012b67e>] schedule_timeout+0xbe/0xc0
[<c022485b>] datagram_poll+0x2b/0xca
[<c021e809>] sock_poll+0x29/0x40
[<c0170f21>] do_select+0x1a1/0x310
[<c0170bb0>] __pollwait+0x0/0xd0
[<c01713cb>] sys_select+0x2fb/0x520
[<c01092bb>] syscall_call+0x7/0xb
sshd D EBC82985 0 718 1 1051 741 699 (NOTLB)
f77bfc84 00000082 d8460080 ebc82985 00003244 c2422bc0 f77fb040 00000082
d8460080 ebc82884 00003244 c02af980 00000001 f1bd4c48 00003244 d8460080
d84600a0 c2422bc0 00001749 f1bd4e95 00003244 f71f3b28 034732b5 f77bfc98
Call Trace:
[<c012b62c>] schedule_timeout+0x6c/0xc0
[<c0142b11>] wakeup_bdflush+0x21/0x40
[<c012b5b0>] process_timeout+0x0/0x10
[<c011eb7b>] io_schedule_timeout+0x2b/0x40
[<c01f54a4>] blk_congestion_wait+0x84/0xa0
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c0148f9f>] try_to_free_pages+0xef/0x190
[<c014187c>] __alloc_pages+0x1dc/0x350
[<c0143bc2>] do_page_cache_readahead+0x172/0x1e0
[<c013e511>] filemap_nopage+0x191/0x3d0
[<c013e380>] filemap_nopage+0x0/0x3d0
[<c014cfd3>] do_no_page+0xd3/0x3c0
[<c014acc7>] pte_alloc_map+0xc7/0x110
[<c014d4f6>] handle_mm_fault+0x106/0x1b0
[<c011a94c>] do_page_fault+0x33c/0x530
[<c0171334>] sys_select+0x264/0x520
[<c011a610>] do_page_fault+0x0/0x530
[<c0109d25>] error_code+0x2d/0x38
xinetd D 00000001 0 741 1 758 718 (NOTLB)
f7627c84 00000086 c241abc0 00000001 00000003 f7626000 f77fb6a0 212e541d
00003243 f5f6b980 f5f6b9a0 c241abc0 00015dbd 212e559d 00003243 f7626000
f7627c98 c241abc0 000004c7 212e61bf 00003243 f72ff4b8 03471440 f7627c98
Call Trace:
[<c012b62c>] schedule_timeout+0x6c/0xc0
[<c0142b11>] wakeup_bdflush+0x21/0x40
[<c012b5b0>] process_timeout+0x0/0x10
[<c011eb7b>] io_schedule_timeout+0x2b/0x40
[<c01f54a4>] blk_congestion_wait+0x84/0xa0
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c0148f9f>] try_to_free_pages+0xef/0x190
[<c014187c>] __alloc_pages+0x1dc/0x350
[<c0143bc2>] do_page_cache_readahead+0x172/0x1e0
[<c01f5d56>] generic_make_request+0x106/0x190
[<c013e511>] filemap_nopage+0x191/0x3d0
[<c013e380>] filemap_nopage+0x0/0x3d0
[<c014cfd3>] do_no_page+0xd3/0x3c0
[<c014acc7>] pte_alloc_map+0xc7/0x110
[<c014d4f6>] handle_mm_fault+0x106/0x1b0
[<c011a94c>] do_page_fault+0x33c/0x530
[<c01cc1fa>] __copy_to_user_ll+0x4a/0x80
[<c0171334>] sys_select+0x264/0x520
[<c011d8e0>] default_wake_function+0x0/0x20
[<c011a610>] do_page_fault+0x0/0x530
[<c0109d25>] error_code+0x2d/0x38
svscan D 00000001 0 758 1 759 785 741 (NOTLB)
f736fcf0 00000082 c241abc0 00000001 00000003 c2422bc0 f776f080 00000096
d8460080 24345b50 00003243 c02af980 00000001 00000096 f736e000 f736e000
f736fd04 c241abc0 00001e2b 24346317 00003243 f72ec878 03471472 f736fd04
Call Trace:
[<c012b62c>] schedule_timeout+0x6c/0xc0
[<c0142b11>] wakeup_bdflush+0x21/0x40
[<c012b5b0>] process_timeout+0x0/0x10
[<c011eb7b>] io_schedule_timeout+0x2b/0x40
[<c01f54a4>] blk_congestion_wait+0x84/0xa0
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c0148f9f>] try_to_free_pages+0xef/0x190
[<c014187c>] __alloc_pages+0x1dc/0x350
[<c013e6a9>] filemap_nopage+0x329/0x3d0
[<c0157728>] read_swap_cache_async+0xb8/0xd0
[<c014c903>] swapin_readahead+0x43/0x90
[<c014cb98>] do_swap_page+0x248/0x320
[<c014d4d0>] handle_mm_fault+0xe0/0x1b0
[<c011a94c>] do_page_fault+0x33c/0x530
[<c012b636>] schedule_timeout+0x76/0xc0
[<c013027e>] sys_rt_sigaction+0xfe/0x120
[<c012b5b0>] process_timeout+0x0/0x10
[<c012b85e>] sys_nanosleep+0x10e/0x1c0
[<c011a610>] do_page_fault+0x0/0x530
[<c0109d25>] error_code+0x2d/0x38
supervise D 00000001 0 759 758 761 760 (NOTLB)
f6e4fcf0 00000086 c2422bc0 00000001 00000003 f6e4e000 f77a0060 606e7502
00003245 d8460080 d84600a0 c2422bc0 0000dc5d 606e7682 00003245 f6e4e000
f6e4fd04 c2422bc0 00000496 6672deb3 00003245 f72ffae8 03473a5b f6e4fd04
Call Trace:
[<c012b62c>] schedule_timeout+0x6c/0xc0
[<c0142b11>] wakeup_bdflush+0x21/0x40
[<c012b5b0>] process_timeout+0x0/0x10
[<c011eb7b>] io_schedule_timeout+0x2b/0x40
[<c01f54a4>] blk_congestion_wait+0x84/0xa0
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c0148f9f>] try_to_free_pages+0xef/0x190
[<c014187c>] __alloc_pages+0x1dc/0x350
[<c013e6a9>] filemap_nopage+0x329/0x3d0
[<c0157728>] read_swap_cache_async+0xb8/0xd0
[<c014c903>] swapin_readahead+0x43/0x90
[<c014cb98>] do_swap_page+0x248/0x320
[<c014d4d0>] handle_mm_fault+0xe0/0x1b0
[<c011a94c>] do_page_fault+0x33c/0x530
[<c0170ba4>] poll_freewait+0x44/0x50
[<c01719d2>] sys_poll+0x272/0x2c0
[<c0170bb0>] __pollwait+0x0/0xd0
[<c011a610>] do_page_fault+0x0/0x530
[<c0109d25>] error_code+0x2d/0x38
supervise D 00000001 0 760 758 762 759 (NOTLB)
f6e4dc84 00000082 c241abc0 00000001 00000003 c2422bc0 f7306b40 00000082
f5f6b980 211bd57b 00003243 c02af980 00000001 00000082 f6e4c000 f5f83310
f5f83330 c241abc0 0001b8e4 211bdfd4 00003243 f72fe228 0347143e f6e4dc98
Call Trace:
[<c012b62c>] schedule_timeout+0x6c/0xc0
[<c0142b11>] wakeup_bdflush+0x21/0x40
[<c012b5b0>] process_timeout+0x0/0x10
[<c011eb7b>] io_schedule_timeout+0x2b/0x40
[<c01f54a4>] blk_congestion_wait+0x84/0xa0
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c0148f9f>] try_to_free_pages+0xef/0x190
[<c014187c>] __alloc_pages+0x1dc/0x350
[<c0143bc2>] do_page_cache_readahead+0x172/0x1e0
[<c013e511>] filemap_nopage+0x191/0x3d0
[<c013e380>] filemap_nopage+0x0/0x3d0
[<c014cfd3>] do_no_page+0xd3/0x3c0
[<c014acc7>] pte_alloc_map+0xc7/0x110
[<c014d4f6>] handle_mm_fault+0x106/0x1b0
[<c011a94c>] do_page_fault+0x33c/0x530
[<c0170ba4>] poll_freewait+0x44/0x50
[<c01719d2>] sys_poll+0x272/0x2c0
[<c0170bb0>] __pollwait+0x0/0xd0
[<c011a610>] do_page_fault+0x0/0x530
[<c0109d25>] error_code+0x2d/0x38
dnscache D A631E98B 0 761 759 (NOTLB)
f6e2bc84 00000086 d8460080 ac271597 00003245 c2422bc0 f776f2a0 00000082
d8460080 ac271452 00003245 c02af980 00000001 ac271597 00003245 d8460080
d84600a0 c2422bc0 00001661 ac27189b 00003245 f77ad518 03473f51 f6e2bc98
Call Trace:
[<c012b62c>] schedule_timeout+0x6c/0xc0
[<c0142b11>] wakeup_bdflush+0x21/0x40
[<c012b5b0>] process_timeout+0x0/0x10
[<c011eb7b>] io_schedule_timeout+0x2b/0x40
[<c01f54a4>] blk_congestion_wait+0x84/0xa0
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c0148f9f>] try_to_free_pages+0xef/0x190
[<c014187c>] __alloc_pages+0x1dc/0x350
[<c0143bc2>] do_page_cache_readahead+0x172/0x1e0
[<c013e511>] filemap_nopage+0x191/0x3d0
[<c013e380>] filemap_nopage+0x0/0x3d0
[<c014cfd3>] do_no_page+0xd3/0x3c0
[<c014acc7>] pte_alloc_map+0xc7/0x110
[<c014d4f6>] handle_mm_fault+0x106/0x1b0
[<c011a94c>] do_page_fault+0x33c/0x530
[<c0170ba4>] poll_freewait+0x44/0x50
[<c01719d2>] sys_poll+0x272/0x2c0
[<c0170bb0>] __pollwait+0x0/0xd0
[<c011a610>] do_page_fault+0x0/0x530
[<c0109d25>] error_code+0x2d/0x38
multilog S 00000001 4036 762 760 (NOTLB)
f6de9eb4 00000086 c2422bc0 00000001 00000003 c013d359 f77a0b00 00000000
f7629900 c1a5c288 c02b0e40 c1a5c288 0001b38a 0001b38a 00000292 c0157bdf
c034dac0 c2422bc0 00000d4b d73a9523 00001a38 f7629ac8 f739b66c f739b600
Call Trace:
[<c013d359>] __lock_page+0xb9/0xd0
[<c0157bdf>] swap_free+0x2f/0x50
[<c0169f8e>] pipe_wait+0x7e/0xa0
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c016a19f>] pipe_readv+0x1ef/0x2f0
[<c016a2d8>] pipe_read+0x38/0x40
[<c015c4a8>] vfs_read+0xb8/0x130
[<c012f0af>] sys_rt_sigprocmask+0xbf/0x190
[<c015c752>] sys_read+0x42/0x70
[<c01092bb>] syscall_call+0x7/0xb
httpd D 24562C9D 0 785 1 2898 828 758 (NOTLB)
f72a9c84 00000082 00000000 24562c9d 00003243 c2422bc0 f776fb20 00000082
f5f6b980 24562c9d 00003243 c02af980 00000001 00000082 f72a8000 f7756d40
f7756d60 c241abc0 000072d3 24563c20 00003243 f72fe858 03471475 f72a9c98
Call Trace:
[<c012b62c>] schedule_timeout+0x6c/0xc0
[<c0142b11>] wakeup_bdflush+0x21/0x40
[<c012b5b0>] process_timeout+0x0/0x10
[<c011eb7b>] io_schedule_timeout+0x2b/0x40
[<c01f54a4>] blk_congestion_wait+0x84/0xa0
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c0148f9f>] try_to_free_pages+0xef/0x190
[<c014187c>] __alloc_pages+0x1dc/0x350
[<c0143bc2>] do_page_cache_readahead+0x172/0x1e0
[<c013d1b5>] unlock_page+0x15/0x60
[<c013e511>] filemap_nopage+0x191/0x3d0
[<c013e380>] filemap_nopage+0x0/0x3d0
[<c014cfd3>] do_no_page+0xd3/0x3c0
[<c0157bdf>] swap_free+0x2f/0x50
[<c014acc7>] pte_alloc_map+0xc7/0x110
[<c014d4f6>] handle_mm_fault+0x106/0x1b0
[<c011a94c>] do_page_fault+0x33c/0x530
[<c0170bb0>] __pollwait+0x0/0xd0
[<c0171334>] sys_select+0x264/0x520
[<c0109d25>] error_code+0x2d/0x38
mysqld_safe S 00000000 0 835 1 924 851 828 (NOTLB)
f693df50 00000086 7dda6067 00000000 f696bb60 f696bb80 f696bb60 f77acd20
c011a94c f696bb60 f69a9620 080c9f9c 00000001 00000001 080c9f9c f71f26d0
f71f26f0 c241abc0 0001663f 2f2e593c 00000008 f77acee8 fffffe00 f693c000
Call Trace:
[<c011a94c>] do_page_fault+0x33c/0x530
[<c01254ab>] sys_wait4+0x1bb/0x290
[<c011d8e0>] default_wake_function+0x0/0x20
[<c012f0f3>] sys_rt_sigprocmask+0x103/0x190
[<c011d8e0>] default_wake_function+0x0/0x20
[<c01092bb>] syscall_call+0x7/0xb
qmail-send D 3F7BF822 0 851 1 864 900 835 (NOTLB)
f7603c84 00000086 d8460080 457110f1 00003246 c2422bc0 f696b0c0 00000082
d8460080 45710fc1 00003246 c02af980 00000001 457110f1 00003246 d8460080
d84600a0 c2422bc0 00001710 4b664525 00003246 f72ec248 0347495e f7603c98
Call Trace:
[<c012b62c>] schedule_timeout+0x6c/0xc0
[<c0142b11>] wakeup_bdflush+0x21/0x40
[<c012b5b0>] process_timeout+0x0/0x10
[<c011eb7b>] io_schedule_timeout+0x2b/0x40
[<c01f54a4>] blk_congestion_wait+0x84/0xa0
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c0148f9f>] try_to_free_pages+0xef/0x190
[<c014187c>] __alloc_pages+0x1dc/0x350
[<c0143bc2>] do_page_cache_readahead+0x172/0x1e0
[<c014ca8f>] do_swap_page+0x13f/0x320
[<c013e511>] filemap_nopage+0x191/0x3d0
[<c013e380>] filemap_nopage+0x0/0x3d0
[<c014cfd3>] do_no_page+0xd3/0x3c0
[<c014acc7>] pte_alloc_map+0xc7/0x110
[<c014d4f6>] handle_mm_fault+0x106/0x1b0
[<c011a94c>] do_page_fault+0x33c/0x530
[<c0171334>] sys_select+0x264/0x520
[<c011a610>] do_page_fault+0x0/0x530
[<c0109d25>] error_code+0x2d/0x38
splogger S 114C3021 5660 864 851 865 (NOTLB)
f684beb4 00000086 f7da7330 114c3021 02002c19 00000003 f68addc0 00000009
f684bea4 00000000 c021dd7c f684d9a0 00000000 f7140580 f684bf90 f7da7330
f7da7350 c241abc0 00000a82 c14af7c2 000019e4 f684db68 c26e038c c26e0320
Call Trace:
[<c021dd7c>] sockfd_lookup+0x1c/0x80
[<c0169f8e>] pipe_wait+0x7e/0xa0
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c01790da>] update_atime+0x9a/0xe0
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c016a19f>] pipe_readv+0x1ef/0x2f0
[<c016a2d8>] pipe_read+0x38/0x40
[<c015c4a8>] vfs_read+0xb8/0x130
[<c010fa00>] do_gettimeofday+0x20/0xc0
[<c015c752>] sys_read+0x42/0x70
[<c01092bb>] syscall_call+0x7/0xb
qmail-lspawn S 00000001 0 865 851 866 864 (NOTLB)
f685feb0 00000086 c241abc0 00000001 00000003 c138a310 f689b960 000001d5
f685ff40 00000000 c0141740 c02afe00 00000000 00000000 c14176c1 00000000
f71f2d00 c241abc0 000028b6 c141be0d 000019e4 f71f2ec8 00000000 7fffffff
Call Trace:
[<c0141740>] __alloc_pages+0xa0/0x350
[<c012b67e>] schedule_timeout+0xbe/0xc0
--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc
* Re: 2.6.1 IO lockup on SMP systems
From: Andrew Morton @ 2004-02-21 19:30 UTC
To: Sergey S. Kostyliov; +Cc: linux-kernel, anton
"Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
>
> Hello Andrew,
>
> On Sunday 01 February 2004 03:17, Andrew Morton wrote:
> > "Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
> > >
> > > I have experienced lockups on three of my servers with 2.6.1. It doesn't
> > > look like a deadlock: the box is still pingable, and all TCP ports which were
> > > in the listen state before a lockup remain in the listen state, but I can't get
> > > any data from these ports. According to sar(1), the systems were not overloaded
> > > right before a lockup. And there are no log entries in any user service logs
> > > for almost 10 hours after the lockup.
> >
> > Please ensure that CONFIG_KALLSYMS is enabled, then generate an all-tasks
> > backtrace on a locked machine with sysrq-T or `echo t >
> > /proc/sysrq-trigger'. Then send us the resulting trace.
>
> I've just reproduced this lockup with 2.6.3.
>
> >
> > You may need a serial console to be able to capture all the output.
> >
> > Also, it would be useful to know what sort of load the machines are under,
> > and what filesystems are in use.
>
> The machine is an HTTP server. The main applications are:
> 1) apache 1.3 which serves php pages (mod_php):
> 15.3 requests/sec - 111.9 kB/second - 7.3 kB/request
> 54 requests currently being processed, 19 idle servers
> 2) mysql:
> Threads: 19 Questions: 26922012 Slow queries: 9799 Opens: 64980
> Flush tables: 1 Open tables: 630 Queries per second avg: 143.547
>
> This is an IO bound machine in general. All filesystems are reiserfs.
>
> Here is the sysrq-T output obtained from a locked box via serial console:
OK, so everything is stuck trying to allocate memory. Perhaps you ran out
of swap space, or some process has gone berserk allocating memory.
How much memory does the machine have, and how much swap space?
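The reading that nearly every task is stuck in the allocation path can be checked mechanically against a saved copy of the sysrq-T output. The sketch below assumes the 2.6-era layout seen in the trace in this thread (a task line ending in "(NOTLB)" or "(L-TLB)", followed by "[<address>] symbol+0x..." frames); the function name summarize_trace and both regular expressions are illustrative and would need adjusting for other kernels.

```python
import re
from collections import Counter

# Task header, e.g. "init D 28E916FC 24 1 0 2 (NOTLB)":
# name, state letter, PC, free stack, pid, then family pids.
TASK_RE = re.compile(
    r"^(?P<name>\S+)\s+(?P<state>[A-Z])\s+[0-9A-Fa-f]{8}\s+\d+\s+"
    r"(?P<pid>\d+)\s.*\((?:NOTLB|L-TLB)\)\s*$"
)
# Stack frame, e.g. "[<c01f54a4>] blk_congestion_wait+0x84/0xa0".
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(?P<sym>[\w.]+)\+0x")


def summarize_trace(text, symbol="blk_congestion_wait"):
    """Return (tasks per state, [(name, pid)] whose stack contains symbol)."""
    states = Counter()
    waiting = []
    current = None  # (name, pid) of the task whose frames we are reading
    for raw in text.splitlines():
        line = raw.strip()
        task = TASK_RE.match(line)
        if task:
            current = (task.group("name"), int(task.group("pid")))
            states[task.group("state")] += 1
            continue
        frame = FRAME_RE.search(line)
        if frame and current and frame.group("sym") == symbol:
            if current not in waiting:
                waiting.append(current)
    return states, waiting
```

Run over the trace posted earlier in this thread, the second return value lists the tasks that have blk_congestion_wait on their stack, which are the ones waiting in try_to_free_pages.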
I suggest that you run a `vmstat 30' trace on a terminal somewhere, see what
it says prior to the hangs. Also capture the sysrq-M output after it has
hung.
It would be useful to monitor the contents of /proc/vmstat also.
And perhaps keep top running in `sort by memory usage' mode.
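One way to monitor /proc/vmstat as suggested above is to log, at a fixed interval, which counters changed since the previous sample. This is a rough sketch only: the function names, the log format and the 30-second default (chosen to match `vmstat 30') are assumptions, and the parser simply expects one "name value" pair per line.

```python
import os
import time


def parse_vmstat(text):
    """Turn 'name value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].lstrip("-").isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def changed_counters(before, after):
    """Return {name: delta} for counters that differ between two samples."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }


def log_vmstat(log_path, interval=30, source="/proc/vmstat", samples=None):
    """Append one timestamped line of counter deltas per interval.

    samples=None means run until interrupted. Each line is flushed and
    fsync'ed so that the last sample before a hang has a chance to
    reach the disk.
    """
    previous = None
    taken = 0
    with open(log_path, "a") as log:
        while samples is None or taken < samples:
            with open(source) as stat_file:
                current = parse_vmstat(stat_file.read())
            if previous is not None:
                stamp = time.strftime("%Y-%m-%d %H:%M:%S")
                log.write(f"{stamp} {changed_counters(previous, current)}\n")
                log.flush()
                os.fsync(log.fileno())
            previous = current
            taken += 1
            if samples is None or taken < samples:
                time.sleep(interval)
```

Because the hang appears to involve IO, a log file on the affected machine may never be written out; printing over an ssh session to another host, as with the vmstat trace, is probably the safer place for the output.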
* Re: 2.6.1 IO lockup on SMP systems
From: Alexander Y. Fomichev @ 2004-02-22 17:39 UTC
To: Andrew Morton; +Cc: Sergey S. Kostyliov, linux-kernel, anton
On Saturday 21 February 2004 22:30, Andrew Morton wrote:
> "Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
> > Hello Andrew,
> >
> > On Sunday 01 February 2004 03:17, Andrew Morton wrote:
> > > "Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
> > > > I had experienced a lockups on three of my servers with 2.6.1. It
> > > > doesn't look like a deadlock, the box is still pingable and all tcp
> > > > ports which were in listen state before a lockup are remains in
> > > > listen state, but I can't get any data from this ports. According to
> > > > sar(1) systems had not been overloaded right before a lockup. And
> > > > there is no log entries in all user services logs for almost 10 hours
> > > > after lockup.
> > >
> > > Please ensure that CONFIG_KALLSYMS is enabled, then generate an
> > > > all-tasks backtrace on a locked machine with sysrq-T or `echo t >
> > > /proc/sysrq-trigger'. Then send us the resulting trace.
> >
> > I've just reproduced this lockup with 2.6.3.
> >
> > > You may need a serial console to be able to capture all the output.
> > >
> > > Also, it would be useful to know what sort of load the machines are
> > > under, and what filesystems are in use.
> >
> > The machine is a http server. The main applications are:
> > 1) apache 1.3 which serves php pages (mod_php):
> > 15.3 requests/sec - 111.9 kB/second - 7.3 kB/request
> > 54 requests currently being processed, 19 idle servers
> > 2) mysql:
> > Threads: 19 Questions: 26922012 Slow queries: 9799 Opens: 64980
> > Flush tables: 1 Open tables: 630 Queries per second avg: 143.547
> >
> > This is an IO bound machine in general. All filesystems are reiserfs.
> >
> > Here is the sysrq-T output obtained from a locked box via serial console:
>
> OK, so everything is stuck trying to allocate memory. Perhaps you ran out
> of swapspace, or some process has gone berzerk allocating memory.
>
> How much memory does the machine have, and how much swap space?
>
# free
total used free shared buffers cached
Mem: 2073868 2067508 6360 0 232708 897828
-/+ buffers/cache: 936972 1136896
Swap: 1535976 5228 1530748
> I suggest that you run a `vmstat 30' trace on a terminal somewhere, see
> what it says prior to the hangs.
OK. We'll try to get it next time.
> Also capture the sysrq-M output after it
> has hung.
>
These "showmem" and "showreg" dumps were taken just before the
"SysRq: Show State" output in the previous message.
SysRq : Show Memory
Mem-info:
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
Free pages: 3172kB (512kB HighMem)
Active:1783 inactive:87 dirty:0 writeback:0 unstable:0 free:793
DMA free:1292kB min:16kB low:32kB high:48kB active:3748kB inactive:0kB
Normal free:1368kB min:936kB low:1872kB high:2808kB active:1368kB
inactive:356kB
HighMem free:512kB min:512kB low:1024kB high:1536kB active:2008kB
inactive:0kB
DMA: 151*4kB 70*8kB 6*16kB 1*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1292kB
Normal: 192*4kB 9*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 1368kB
HighMem: 0*4kB 2*8kB 3*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 512kB
Swap cache: add 1140128, delete 1140063, find 459572/584559, race 145+217
Free swap: 384364kB
524288 pages of RAM
294912 pages of HIGHMEM
5821 reserved pages
976 pages shared
65 pages swap cached
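The per-order free lists in the dump above can be cross-checked against the per-zone "free:" figures by summing count*size over all orders. A minimal sketch in Python, assuming the `N*SIZEkB` token format shown in this dump; the function name `buddy_total_kb` is made up for illustration and is not part of any kernel tool:

```python
import re

def buddy_total_kb(free_list: str) -> int:
    """Sum a sysrq-M buddy free list such as '151*4kB 70*8kB ...' in kB."""
    total = 0
    # Each token is <block count>*<block size>kB, one token per allocation order.
    for count, size_kb in re.findall(r"(\d+)\*(\d+)kB", free_list):
        total += int(count) * int(size_kb)
    return total

# Free lists copied from the Show Memory dump above.
zones = {
    "DMA": "151*4kB 70*8kB 6*16kB 1*32kB 0*64kB 0*128kB 0*256kB "
           "0*512kB 0*1024kB 0*2048kB 0*4096kB",
    "Normal": "192*4kB 9*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB "
              "1*512kB 0*1024kB 0*2048kB 0*4096kB",
    "HighMem": "0*4kB 2*8kB 3*16kB 0*32kB 1*64kB 1*128kB 1*256kB "
               "0*512kB 0*1024kB 0*2048kB 0*4096kB",
}

for name, free_list in zones.items():
    print(name, buddy_total_kb(free_list), "kB")
```

For this dump the three sums are 1292kB, 1368kB and 512kB, which agree with the per-zone "free:" values and add up to the 3172kB on the "Free pages" line.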
SysRq : Show Regs
Pid: 0, comm: swapper
EIP: 0060:[<c0106d1c>] CPU: 0
EIP is at default_idle+0x2c/0x40
EFLAGS: 00000246 Not tainted
EAX: 00000000 EBX: c02e6000 ECX: c0106cf0 EDX: c02e6000
ESI: c02e6000 EDI: c0105000 EBP: 0008e000 DS: 007b ES: 007b
CR0: 8005003b CR2: bffff7e0 CR3: 2d021000 CR4: 00000690
Call Trace:
[<c0106dab>] cpu_idle+0x3b/0x50
[<c02e88e9>] start_kernel+0x179/0x1a0
[<c02e84a0>] unknown_bootoption+0x0/0x120
> It would be useful to monitor the contents of /proc/vmstat also.
>
> And perhaps keep top running in `sort by memory usage' mode.
OK, we'll try that too.
--
< on behalf of "Sergey S. Kostyliov" <rathamahata@php4.ru> >
Best regards.
Alexander Y. Fomichev <gluk@php4.ru>
Public PGP key: http://sysadminday.org.ru/gluk.asc
* Re: 2.6.1 IO lockup on SMP systems
2004-02-22 17:39 ` Alexander Y. Fomichev
@ 2004-02-23 17:27 ` Sergey S. Kostyliov
2004-02-23 21:30 ` Mike Fedyk
2004-02-23 22:26 ` Andrew Morton
0 siblings, 2 replies; 24+ messages in thread
From: Sergey S. Kostyliov @ 2004-02-23 17:27 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, Alexander Y. Fomichev, anton
Hello Andrew,
Now this happens for the third time.
> > > I've just reproduced this lockup with 2.6.3.
> > >
> > > > You may need a serial console to be able to capture all the output.
> > > >
> > > > Also, it would be useful to know what sort of load the machines are
> > > > under, and what filesystems are in use.
> > >
> > > The machine is a http server. The main applications are:
> > > 1) apache 1.3 which serves php pages (mod_php):
> > > 15.3 requests/sec - 111.9 kB/second - 7.3 kB/request
> > > 54 requests currently being processed, 19 idle servers
> > > 2) mysql:
> > > Threads: 19 Questions: 26922012 Slow queries: 9799 Opens: 64980
> > > Flush tables: 1 Open tables: 630 Queries per second avg: 143.547
> > >
> > > This is an IO bound machine in general. All filesystems are reiserfs.
> > >
> > > Here is the sysrq-T output obtained from a locked box via serial console:
> >
> > OK, so everything is stuck trying to allocate memory. Perhaps you ran out
> > of swapspace, or some process has gone berzerk allocating memory.
Memory exhaustion is indeed possible on this box. I'll double-check the
ulimit and /etc/security/limits.conf settings. The only thing that worries
me is that this box had been running for months without any problems on
2.4.23aa1.
I have added another 2GB of swap space (I hope this gives enough time
to find the memory-hungry process(es)).
> >
> > How much memory does the machine have, and how much swap space?
> >
> # free
> total used free shared buffers cached
> Mem: 2073868 2067508 6360 0 232708 897828
> -/+ buffers/cache: 936972 1136896
> Swap: 1535976 5228 1530748
>
> > I suggest that you run a `vmstat 30' trace on a terminal somewhere, see
> > what it says prior to the hangs.
> OK. We'll try to get it next time.
Here it is:
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 0 551920 8108 203744 933532 0 0 4 68 1214 426 5 1 92 2
0 0 551928 7140 203756 930316 0 0 17 61 1240 529 8 1 89 2
0 0 551976 5788 203772 928224 1 6 360 139 1297 317 7 2 83 8
0 0 551968 7588 203812 923504 0 0 19 125 1303 308 8 2 87 4
0 1 551976 10444 203892 914100 0 0 25 127 1433 438 10 3 85 3
0 0 551976 9220 204004 914804 0 0 123 126 1278 325 6 1 88 5
0 0 551976 8108 204044 912248 0 0 38 69 1279 291 6 1 91 2
0 1 551976 11828 204144 912320 1 0 135 94 1249 296 6 1 89 3
0 5 562204 3280 203952 157084 1 566 305 674 1281 313 6 4 73 17
0 18 598224 4276 1888 33356 91 2734 233 2761 1090 199 0 2 0 97
1 38 662520 2760 2104 30520 110 3721 261 3738 1161 831 1 2 0 97
10 41 699936 2772 1920 28716 123 2924 249 2946 1103 1273 0 3 0 97
0 39 748588 2956 1956 22668 160 3313 245 3331 1056 1047 0 2 0 98
0 38 796100 3108 1888 21348 321 3191 430 3206 1045 1002 0 2 0 97
4 43 844532 3308 1956 17644 518 3719 670 3733 1357 999 0 2 0 98
0 51 882596 2940 2052 13960 520 2796 705 2810 1048 1182 0 2 0 98
3 59 913392 2456 2048 10900 1013 2524 1308 2542 1144 601 0 2 0 98
5 71 937816 2760 2072 8584 1534 2681 1860 2702 1234 607 0 2 0 97
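The point where the box tips over can also be located programmatically rather than by eye. A rough sketch in Python, assuming the 16-column `vmstat` layout shown above; the threshold of 500 on the `so` column and the helper names `parse_vmstat` and `first_swap_storm` are arbitrary choices for illustration, not anything vmstat itself defines:

```python
VMSTAT_COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa".split()

def parse_vmstat(text):
    """Return one dict per numeric vmstat row; header lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly 16 all-numeric fields; the two header lines do not.
        if len(fields) == len(VMSTAT_COLUMNS) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(VMSTAT_COLUMNS, map(int, fields))))
    return rows

def first_swap_storm(rows, so_threshold=500):
    """Index of the first row whose swap-out column reaches the threshold, else None."""
    for index, row in enumerate(rows):
        if row["so"] >= so_threshold:
            return index
    return None

# Four consecutive samples copied from the trace above.
SAMPLE = """\
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0 551976   8108 204044 912248    0    0    38    69 1279   291  6  1 91  2
 0  1 551976  11828 204144 912320    1    0   135    94 1249   296  6  1 89  3
 0  5 562204   3280 203952 157084    1  566   305   674 1281   313  6  4 73 17
 0 18 598224   4276   1888  33356   91 2734   233  2761 1090   199  0  2  0 97
"""

rows = parse_vmstat(SAMPLE)
onset = first_swap_storm(rows)
if onset is not None:
    print("swap-out onset at sample", onset, rows[onset])
```

On the trace above this picks out the sample where `so` jumps to 566 and the blocked-task count rises to 5; in the sample after it, `buff` falls from 203952 to 1888.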
>
> > Also capture the sysrq-M output after it
> > has hung.
> >
> This "showmem" && "showreg" have been taken just before
> "SysRq: Show State" from previous message.
>
> SysRq : Show Memory
> Mem-info:
> DMA per-cpu:
> cpu 0 hot: low 2, high 6, batch 1
> cpu 0 cold: low 0, high 2, batch 1
> cpu 1 hot: low 2, high 6, batch 1
> cpu 1 cold: low 0, high 2, batch 1
> Normal per-cpu:
> cpu 0 hot: low 32, high 96, batch 16
> cpu 0 cold: low 0, high 32, batch 16
> cpu 1 hot: low 32, high 96, batch 16
> cpu 1 cold: low 0, high 32, batch 16
> HighMem per-cpu:
> cpu 0 hot: low 32, high 96, batch 16
> cpu 0 cold: low 0, high 32, batch 16
> cpu 1 hot: low 32, high 96, batch 16
> cpu 1 cold: low 0, high 32, batch 16
>
> Free pages: 3172kB (512kB HighMem)
> Active:1783 inactive:87 dirty:0 writeback:0 unstable:0 free:793
> DMA free:1292kB min:16kB low:32kB high:48kB active:3748kB inactive:0kB
> Normal free:1368kB min:936kB low:1872kB high:2808kB active:1368kB
> inactive:356kB
> HighMem free:512kB min:512kB low:1024kB high:1536kB active:2008kB
> inactive:0kB
> DMA: 151*4kB 70*8kB 6*16kB 1*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1292kB
> Normal: 192*4kB 9*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 1368kB
> HighMem: 0*4kB 2*8kB 3*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 512kB
> Swap cache: add 1140128, delete 1140063, find 459572/584559, race 145+217
> Free swap: 384364kB
> 524288 pages of RAM
> 294912 pages of HIGHMEM
> 5821 reserved pages
> 976 pages shared
> 65 pages swap cached
>
>
> SysRq : Show Regs
>
> Pid: 0, comm: swapper
> EIP: 0060:[<c0106d1c>] CPU: 0
> EIP is at default_idle+0x2c/0x40
> EFLAGS: 00000246 Not tainted
> EAX: 00000000 EBX: c02e6000 ECX: c0106cf0 EDX: c02e6000
> ESI: c02e6000 EDI: c0105000 EBP: 0008e000 DS: 007b ES: 007b
> CR0: 8005003b CR2: bffff7e0 CR3: 2d021000 CR4: 00000690
> Call Trace:
> [<c0106dab>] cpu_idle+0x3b/0x50
> [<c02e88e9>] start_kernel+0x179/0x1a0
> [<c02e84a0>] unknown_bootoption+0x0/0x120
I forgot to switch on output capture in minicom, so the sysrq-M output
was scrolled off the terminal by the subsequent sysrq-T, most of which
was in turn scrolled off as well. The part of the sysrq-T output that
survived is almost the same as the previous one.
>
> > It would be useful to monitor the contents of /proc/vmstat also.
The last /proc/vmstat snapshot was taken about 5 minutes before the real lockup.
(It looks like a simple "while true; do date; cat /proc/vmstat; sleep 10; done"
script suffers from the same memory exhaustion problem.)
Mon Feb 23 17:41:34 MSK 2004
nr_dirty 136
nr_writeback 0
nr_unstable 0
nr_page_table_pages 987
nr_mapped 227018
nr_slab 13041
pgpgin 8593704
pgpgout 4349808
pswpin 169183
pswpout 183480
pgalloc 20244471
pgfree 20247061
pgactivate 548813
pgdeactivate 628769
pgfault 25756129
pgmajfault 67820
pgscan 4570640
pgrefill 2934423
pgsteal 2024118
pginodesteal 0
kswapd_steal 1886046
kswapd_inodesteal 891
pageoutrun 10047
allocstall 3930
pgrotated 178662
Mon Feb 23 17:41:44 MSK 2004
nr_dirty 339
nr_writeback 0
nr_unstable 0
nr_page_table_pages 991
nr_mapped 226443
nr_slab 13036
pgpgin 8593956
pgpgout 4351080
pswpin 169186
pswpout 183480
pgalloc 20250240
pgfree 20253382
pgactivate 549009
pgdeactivate 628769
pgfault 25764719
pgmajfault 67827
pgscan 4570640
pgrefill 2934423
pgsteal 2024118
pginodesteal 0
kswapd_steal 1886046
kswapd_inodesteal 891
pageoutrun 10047
allocstall 3930
pgrotated 178662
Mon Feb 23 17:41:54 MSK 2004
nr_dirty 505
nr_writeback 0
nr_unstable 0
nr_page_table_pages 993
nr_mapped 226477
nr_slab 13049
pgpgin 8594244
pgpgout 4352144
pswpin 169186
pswpout 183480
pgalloc 20256355
pgfree 20259400
pgactivate 549048
pgdeactivate 628769
pgfault 25772385
pgmajfault 67837
pgscan 4570640
pgrefill 2934423
pgsteal 2024118
pginodesteal 0
kswapd_steal 1886046
kswapd_inodesteal 891
pageoutrun 10047
allocstall 3930
pgrotated 178662
Mon Feb 23 17:42:15 MSK 2004
nr_dirty 0
nr_writeback 765
nr_unstable 0
nr_page_table_pages 1044
nr_mapped 209677
nr_slab 4672
pgpgin 8605592
pgpgout 4424120
pswpin 169454
pswpout 201127
pgalloc 20561829
pgfree 20563033
pgactivate 601317
pgdeactivate 778533
pgfault 25777874
pgmajfault 68001
pgscan 5399589
pgrefill 3543496
pgsteal 2300249
pginodesteal 0
kswapd_steal 2058168
kswapd_inodesteal 14284
pageoutrun 10114
allocstall 7008
pgrotated 193130
Mon Feb 23 17:42:47 MSK 2004
nr_dirty 1
nr_writeback 597
nr_unstable 0
nr_page_table_pages 1213
nr_mapped 190410
nr_slab 4640
pgpgin 8614032
pgpgout 4500108
pswpin 170334
pswpout 219922
pgalloc 20588517
pgfree 20589474
pgactivate 711818
pgdeactivate 908805
pgfault 25783157
pgmajfault 68204
pgscan 5667215
pgrefill 3774369
pgsteal 2322731
pginodesteal 0
kswapd_steal 2066149
kswapd_inodesteal 14352
pageoutrun 10167
allocstall 7383
pgrotated 209403
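Most of these fields are cumulative counters, so the interesting numbers are the differences between consecutive snapshots. A small sketch in Python, assuming the "name value" line format shown above; `parse_counters` and `counter_deltas` are illustrative names, and the `nr_*` fields are gauges rather than counters, so their differences are just the change in level:

```python
def parse_counters(snapshot: str) -> dict:
    """Parse '<name> <integer>' lines; the date line and anything else is skipped."""
    counters = {}
    for line in snapshot.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters

def counter_deltas(before: dict, after: dict) -> dict:
    """Difference (after - before) for every field present in both snapshots."""
    return {name: after[name] - before[name] for name in after if name in before}

# A few fields copied from the 17:41:54 and 17:42:15 snapshots above.
SNAP_A = """Mon Feb 23 17:41:54 MSK 2004
nr_slab 13049
pswpin 169186
pswpout 183480
pgscan 4570640
allocstall 3930
"""
SNAP_B = """Mon Feb 23 17:42:15 MSK 2004
nr_slab 4672
pswpin 169454
pswpout 201127
pgscan 5399589
allocstall 7008
"""

deltas = counter_deltas(parse_counters(SNAP_A), parse_counters(SNAP_B))
for name, change in deltas.items():
    print(f"{name:12s} {change:+d}")
```

Between those two snapshots pswpout rises by 17647 pages and allocstall by 3078, whereas neither field changed at all across the three snapshots before them.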
> >
> > And perhaps keep top running in `sort by memory usage' mode.
> ok, we'll try too.
Unfortunately the top output is not very useful because the mysqld threads
hide the real problem. I'll try to run top in batch mode next time.
top - 17:47:03 up 7:10, 3 users, load average: 124.72, 66.96, 27.71
Tasks: 219 total, 1 running, 218 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.2% us, 2.1% sy, 0.0% ni, 0.0% id, 97.6% wa, 0.1% hi, 0.0% si
Mem: 2073868k total, 2070796k used, 3072k free, 1996k buffers
Swap: 1535976k total, 944520k used, 591456k free, 6884k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
896 mysql 15 0 1013m 21m 4896 S 0.1 1.1 0:05.64 mysqld
939 mysql 16 0 1013m 21m 4896 S 0.0 1.1 0:00.05 mysqld
940 mysql 20 0 1013m 21m 4896 S 0.0 1.1 0:00.00 mysqld
941 mysql 17 0 1013m 21m 4896 S 0.0 1.1 0:00.00 mysqld
942 mysql 15 0 1013m 21m 4896 D 0.4 1.1 1:26.19 mysqld
943 mysql 16 0 1013m 21m 4896 S 0.0 1.1 0:00.00 mysqld
961 mysql 17 0 1013m 21m 4896 D 0.0 1.1 0:00.00 mysqld
962 mysql 15 0 1013m 21m 4896 D 0.0 1.1 0:13.12 mysqld
971 mysql 16 0 1013m 21m 4896 S 0.0 1.1 0:00.05 mysqld
972 mysql 16 0 1013m 21m 4896 S 0.0 1.1 0:03.12 mysqld
27314 mysql 15 0 1013m 21m 4896 D 0.0 1.1 0:11.73 mysqld
27325 mysql 15 0 1013m 21m 4896 S 0.0 1.1 0:08.70 mysqld
27339 mysql 15 0 1013m 21m 4896 S 0.0 1.1 0:07.78 mysqld
27361 mysql 16 0 1013m 21m 4896 S 0.0 1.1 0:10.05 mysqld
27375 mysql 15 0 1013m 21m 4896 S 0.0 1.1 0:10.61 mysqld
27390 mysql 16 0 1013m 21m 4896 S 0.0 1.1 0:11.44 mysqld
27392 mysql 16 0 1013m 21m 4896 S 0.0 1.1 0:09.20 mysqld
27393 mysql 16 0 1013m 21m 4896 S 0.0 1.1 0:12.23 mysqld
3671 mysql 16 0 1013m 21m 4896 D 0.0 1.1 0:00.04 mysqld
3672 mysql 18 0 1013m 21m 4896 D 0.0 1.1 0:00.11 mysqld
3691 mysql 16 0 1013m 21m 4896 D 0.1 1.1 0:00.02 mysqld
3704 mysql 17 0 1013m 21m 4896 S 0.0 1.1 0:00.02 mysqld
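One way to keep repeated thread rows from crowding the listing is to collapse rows that report identical memory figures. A heuristic sketch in Python, assuming the 12-column top layout shown above and assuming that rows with the same USER, COMMAND, VIRT, RES and SHR are threads sharing one address space (that is a guess from the output, not something top reports); `to_kb` and `collapse_threads` are illustrative names:

```python
from collections import Counter

def to_kb(field: str) -> int:
    """Convert a top memory field ('4896', '21m', '1.0g') to kB."""
    units = {"m": 1024, "g": 1024 * 1024}
    suffix = field[-1].lower()
    if suffix in units:
        return int(float(field[:-1]) * units[suffix])
    return int(field)

def collapse_threads(top_rows: str) -> Counter:
    """Count top rows that share USER, COMMAND, VIRT, RES and SHR."""
    groups = Counter()
    for line in top_rows.splitlines():
        f = line.split()
        # Columns: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
        if len(f) == 12 and f[0].isdigit():
            key = (f[1], f[11], to_kb(f[4]), to_kb(f[5]), to_kb(f[6]))
            groups[key] += 1
    return groups

# Three rows copied from the listing above.
SAMPLE = """\
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  896 mysql     15   0 1013m  21m 4896 S  0.1  1.1   0:05.64 mysqld
  939 mysql     16   0 1013m  21m 4896 S  0.0  1.1   0:00.05 mysqld
  942 mysql     15   0 1013m  21m 4896 D  0.4  1.1   1:26.19 mysqld
"""

for (user, command, virt, res, shr), count in collapse_threads(SAMPLE).items():
    print(f"{command} ({user}): {count} rows, RES {res} kB")
```

Applied to the full listing above, the 22 mysqld rows would collapse into a single entry with about 21 MB resident, leaving room for whatever else is consuming memory.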
Thank you for your help!
--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc
* Re: 2.6.1 IO lockup on SMP systems
2004-02-23 17:27 ` Sergey S. Kostyliov
@ 2004-02-23 21:30 ` Mike Fedyk
2004-02-24 11:56 ` Sergey S. Kostyliov
2004-02-23 22:26 ` Andrew Morton
1 sibling, 1 reply; 24+ messages in thread
From: Mike Fedyk @ 2004-02-23 21:30 UTC (permalink / raw)
To: Sergey S. Kostyliov
Cc: Andrew Morton, linux-kernel, Alexander Y. Fomichev, anton
Sergey S. Kostyliov wrote:
> Hello Andrew,
>
> Now this happens for the third time.
>
>
>>>>I've just reproduced this lockup with 2.6.3.
>>>>
>>>>
>>>>>You may need a serial console to be able to capture all the output.
>>>>>
>>>>>Also, it would be useful to know what sort of load the machines are
>>>>>under, and what filesystems are in use.
>>>>
>>>>The machine is a http server. The main applications are:
>>>>1) apache 1.3 which serves php pages (mod_php):
>>>> 15.3 requests/sec - 111.9 kB/second - 7.3 kB/request
>>>> 54 requests currently being processed, 19 idle servers
>>>>2) mysql:
>>>> Threads: 19 Questions: 26922012 Slow queries: 9799 Opens: 64980
>>>> Flush tables: 1 Open tables: 630 Queries per second avg: 143.547
>>>>
>>>>This is an IO bound machine in general. All filesystems are reiserfs.
>>>>
>>>>Here is the sysrq-T output obtained from a locked box via serial console:
>>>
>>>OK, so everything is stuck trying to allocate memory. Perhaps you ran out
>>>of swapspace, or some process has gone berzerk allocating memory.
>
>
> The memory exhaustion is indeed possible for this box. I'll double check
> ulimit and /etc/security/limits.conf stuff. The only thing which worries
> me that this box had been running for months without any problems with
> 2.4.23aa1.
>
> I have added another 2Gb to swap space (hope this give enough time
> to find the memory hungry process(es)).
Also check how much memory is being used for slab in /proc/meminfo
* Re: 2.6.1 IO lockup on SMP systems
2004-02-23 17:27 ` Sergey S. Kostyliov
2004-02-23 21:30 ` Mike Fedyk
@ 2004-02-23 22:26 ` Andrew Morton
2004-02-24 7:23 ` Marcelo Tosatti
2004-02-24 11:54 ` Sergey S. Kostyliov
1 sibling, 2 replies; 24+ messages in thread
From: Andrew Morton @ 2004-02-23 22:26 UTC (permalink / raw)
To: Sergey S. Kostyliov; +Cc: linux-kernel, gluk, anton
"Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
>
> > > OK, so everything is stuck trying to allocate memory. Perhaps you ran out
> > > of swapspace, or some process has gone berzerk allocating memory.
>
> The memory exhaustion is indeed possible for this box. I'll double check
> ulimit and /etc/security/limits.conf stuff. The only thing which worries
> me that this box had been running for months without any problems with
> 2.4.23aa1.
It is conceivable that you have some application which runs OK on 2.4.x but
has some subtle bug which causes the app to go crazy on a 2.6 kernel
consuming lots of memory. Or there's a bug in the 2.6 kernel ;)
> I have added another 2Gb to swap space (hope this give enough time
> to find the memory hungry process(es)).
>
> > >
> > > How much memory does the machine have, and how much swap space?
> > >
> > # free
> > total used free shared buffers cached
> > Mem: 2073868 2067508 6360 0 232708 897828
> > -/+ buffers/cache: 936972 1136896
> > Swap: 1535976 5228 1530748
> >
> > > I suggest that you run a `vmstat 30' trace on a terminal somewhere, see
> > > what it says prior to the hangs.
> > OK. We'll try to get it next time.
>
> Here it is:
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 1 0 551920 8108 203744 933532 0 0 4 68 1214 426 5 1 92 2
> 0 0 551928 7140 203756 930316 0 0 17 61 1240 529 8 1 89 2
> 0 0 551976 5788 203772 928224 1 6 360 139 1297 317 7 2 83 8
> 0 0 551968 7588 203812 923504 0 0 19 125 1303 308 8 2 87 4
> 0 1 551976 10444 203892 914100 0 0 25 127 1433 438 10 3 85 3
> 0 0 551976 9220 204004 914804 0 0 123 126 1278 325 6 1 88 5
> 0 0 551976 8108 204044 912248 0 0 38 69 1279 291 6 1 91 2
> 0 1 551976 11828 204144 912320 1 0 135 94 1249 296 6 1 89 3
> 0 5 562204 3280 203952 157084 1 566 305 674 1281 313 6 4 73 17
> 0 18 598224 4276 1888 33356 91 2734 233 2761 1090 199 0 2 0 97
> 1 38 662520 2760 2104 30520 110 3721 261 3738 1161 831 1 2 0 97
> 10 41 699936 2772 1920 28716 123 2924 249 2946 1103 1273 0 3 0 97
> 0 39 748588 2956 1956 22668 160 3313 245 3331 1056 1047 0 2 0 98
> 0 38 796100 3108 1888 21348 321 3191 430 3206 1045 1002 0 2 0 97
> 4 43 844532 3308 1956 17644 518 3719 670 3733 1357 999 0 2 0 98
> 0 51 882596 2940 2052 13960 520 2796 705 2810 1048 1182 0 2 0 98
> 3 59 913392 2456 2048 10900 1013 2524 1308 2542 1144 601 0 2 0 98
> 5 71 937816 2760 2072 8584 1534 2681 1860 2702 1234 607 0 2 0 97
OK, so it's doing a lot of swapping and your swap utilisation is
continuously increasing. I would suspect an application or kernel memory
leak.
I suggest you keep that `vmstat 30' running all the time. When the machine
dies, take a look at the final 20 lines.
Also, run
while true
do
cat /proc/meminfo
sleep 10
done
and record the info which that leaves behind when the machine locks up.
This should tell us whether it is an application or kernel memory leak. If
it is indeed a leak.
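Because a hung box cannot be asked for its last readings afterwards, it may help to timestamp every sample and force it to disk as it is written. A minimal sketch in Python rather than shell, under stated assumptions: the log file name, the 10-second interval and the function names are placeholders, and the fsync only helps for as long as the box can still complete writes to the log device:

```python
import os
import time

def append_snapshot(source_path: str, log_path: str) -> None:
    """Append one timestamped copy of source_path to log_path and fsync it."""
    with open(source_path) as src:
        body = src.read()
    stamp = time.strftime("%a %b %d %H:%M:%S %Z %Y")
    with open(log_path, "a") as log:
        log.write(f"=== {stamp}\n{body}\n")
        log.flush()
        os.fsync(log.fileno())  # push the record out of the page cache

def log_forever(source_path="/proc/meminfo", log_path="meminfo.log", interval=10):
    """Hypothetical driver: sample every `interval` seconds until interrupted."""
    while True:
        append_snapshot(source_path, log_path)
        time.sleep(interval)
```

Calling `log_forever()` from a terminal would reproduce the loop above with a date on every record; pointing `log_path` at a file on another machine (for example over NFS) is one possible variation if the local disks are the suspect.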
* Re: 2.6.1 IO lockup on SMP systems
2004-02-24 7:23 ` Marcelo Tosatti
@ 2004-02-24 6:53 ` Andrew Morton
0 siblings, 0 replies; 24+ messages in thread
From: Andrew Morton @ 2004-02-24 6:53 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: rathamahata, linux-kernel, gluk, anton
Marcelo Tosatti <marcelo.tosatti@cyclades.com> wrote:
>
> > Also, run
> >
> > while true
> > do
> > cat /proc/meminfo
> > sleep 10
> > done
> >
> > and record the info which that leaves behind when the machine locks up.
> > This should tell us whether it is an application or kernel memory leak. If
> > it is indeed a leak.
>
> Hi Andrew,
>
> Care to explain to me why the kernel should hang because of an
> application leak?
It shouldn't - the oom killer should have done something. But we'll
address that once we've confirmed that something really is leaking.
> The hang looks wrong even if the leak is in userspace app, yes?
Probably, yes.
* Re: 2.6.1 IO lockup on SMP systems
2004-02-23 22:26 ` Andrew Morton
@ 2004-02-24 7:23 ` Marcelo Tosatti
2004-02-24 6:53 ` Andrew Morton
2004-02-24 11:54 ` Sergey S. Kostyliov
1 sibling, 1 reply; 24+ messages in thread
From: Marcelo Tosatti @ 2004-02-24 7:23 UTC (permalink / raw)
To: Andrew Morton; +Cc: Sergey S. Kostyliov, linux-kernel, gluk, anton
On Mon, 23 Feb 2004, Andrew Morton wrote:
> "Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
> >
> > > > OK, so everything is stuck trying to allocate memory. Perhaps you ran out
> > > > of swapspace, or some process has gone berzerk allocating memory.
> >
> > The memory exhaustion is indeed possible for this box. I'll double check
> > ulimit and /etc/security/limits.conf stuff. The only thing which worries
> > me that this box had been running for months without any problems with
> > 2.4.23aa1.
>
> It is conceivable that you have some application which runs OK on 2.4.x but
> has some subtle bug which causes the app to go crazy on a 2.6 kernel
> consuming lots of memory. Or there's a bug in the 2.6 kernel ;)
>
> > I have added another 2Gb to swap space (hope this give enough time
> > to find the memory hungry process(es)).
> >
> > > >
> > > > How much memory does the machine have, and how much swap space?
> > > >
> > > # free
> > > total used free shared buffers cached
> > > Mem: 2073868 2067508 6360 0 232708 897828
> > > -/+ buffers/cache: 936972 1136896
> > > Swap: 1535976 5228 1530748
> > >
> > > > I suggest that you run a `vmstat 30' trace on a terminal somewhere, see
> > > > what it says prior to the hangs.
> > > OK. We'll try to get it next time.
> >
> > Here it is:
> > procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
> > r b swpd free buff cache si so bi bo in cs us sy id wa
> > 1 0 551920 8108 203744 933532 0 0 4 68 1214 426 5 1 92 2
> > 0 0 551928 7140 203756 930316 0 0 17 61 1240 529 8 1 89 2
> > 0 0 551976 5788 203772 928224 1 6 360 139 1297 317 7 2 83 8
> > 0 0 551968 7588 203812 923504 0 0 19 125 1303 308 8 2 87 4
> > 0 1 551976 10444 203892 914100 0 0 25 127 1433 438 10 3 85 3
> > 0 0 551976 9220 204004 914804 0 0 123 126 1278 325 6 1 88 5
> > 0 0 551976 8108 204044 912248 0 0 38 69 1279 291 6 1 91 2
> > 0 1 551976 11828 204144 912320 1 0 135 94 1249 296 6 1 89 3
> > 0 5 562204 3280 203952 157084 1 566 305 674 1281 313 6 4 73 17
> > 0 18 598224 4276 1888 33356 91 2734 233 2761 1090 199 0 2 0 97
> > 1 38 662520 2760 2104 30520 110 3721 261 3738 1161 831 1 2 0 97
> > 10 41 699936 2772 1920 28716 123 2924 249 2946 1103 1273 0 3 0 97
> > 0 39 748588 2956 1956 22668 160 3313 245 3331 1056 1047 0 2 0 98
> > 0 38 796100 3108 1888 21348 321 3191 430 3206 1045 1002 0 2 0 97
> > 4 43 844532 3308 1956 17644 518 3719 670 3733 1357 999 0 2 0 98
> > 0 51 882596 2940 2052 13960 520 2796 705 2810 1048 1182 0 2 0 98
> > 3 59 913392 2456 2048 10900 1013 2524 1308 2542 1144 601 0 2 0 98
> > 5 71 937816 2760 2072 8584 1534 2681 1860 2702 1234 607 0 2 0 97
>
> OK, so it's doing a lot of swapping and your swap utilisation is
> continuously increasing. I would suspect an application or kernel memory
> leak.
>
> I suggest you keep that `vmstat 30' running all the time. When the machine
> dies, take a look at the final 20 lines.
>
> Also, run
>
> while true
> do
> cat /proc/meminfo
> sleep 10
> done
>
> and record the info which that leaves behind when the machine locks up.
> This should tell us whether it is an application or kernel memory leak. If
> it is indeed a leak.
Hi Andrew,
Care to explain to me why the kernel should hang because of an application
leak?
The hang looks wrong even if the leak is in a userspace app, yes?
* Re: 2.6.1 IO lockup on SMP systems
2004-02-23 22:26 ` Andrew Morton
2004-02-24 7:23 ` Marcelo Tosatti
@ 2004-02-24 11:54 ` Sergey S. Kostyliov
2004-02-26 12:19 ` Sergey S. Kostyliov
1 sibling, 1 reply; 24+ messages in thread
From: Sergey S. Kostyliov @ 2004-02-24 11:54 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, gluk, anton
On Tuesday 24 February 2004 01:26, Andrew Morton wrote:
> "Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
<cut>
> > The memory exhaustion is indeed possible for this box. I'll double check
> > ulimit and /etc/security/limits.conf stuff. The only thing which worries
> > me that this box had been running for months without any problems with
> > 2.4.23aa1.
>
> It is conceivable that you have some application which runs OK on 2.4.x but
> has some subtle bug which causes the app to go crazy on a 2.6 kernel
> consuming lots of memory. Or there's a bug in the 2.6 kernel ;)
>
> > I have added another 2Gb to swap space (hope this give enough time
> > to find the memory hungry process(es)).
<cut>
>
> OK, so it's doing a lot of swapping and your swap utilisation is
> continuously increasing. I would suspect an application or kernel memory
> leak.
>
> I suggest you keep that `vmstat 30' running all the time. When the machine
> dies, take a look at the final 20 lines.
Here is the data from the last lockup:
1) last 20 entries of the `vmstat 30':
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 1 116676 7752 266156 621360 8 1 1031 186 1364 444 53 5 30 12
1 0 116656 7512 266316 617716 2 3 334 79 1355 334 59 4 34 3
1 0 116240 8072 266800 616444 17 1 539 302 1397 464 59 7 29 6
1 0 116216 13320 266948 614044 1 1 1229 92 1505 587 61 6 27 6
2 0 116208 8344 267152 618048 1 0 436 143 1367 386 58 5 32 5
1 1 116216 6024 267308 619188 0 59 4574 164 1554 742 61 6 20 12
1 1 116284 6468 267736 614028 4 2 1087 117 1458 529 60 7 27 6
1 0 116280 6336 267888 617860 1 0 1225 101 1419 542 59 6 30 6
2 1 116472 7264 268148 619288 0 4 7788 100 1645 950 33 6 29 33
1 1 116728 5976 268296 617112 0 7 7799 86 1566 815 30 6 32 32
2 0 116752 6080 268488 615992 6 8 7434 136 1627 910 34 7 25 34
0 1 116944 6368 268588 615420 1 4 7601 95 1696 952 39 6 25 30
1 0 116968 30600 268896 585832 0 4 2212 176 1584 642 62 7 16 15
0 1 116968 6128 269064 604912 0 0 1410 67 1460 532 60 5 29 6
1 0 116964 6280 269308 604008 0 4 7449 106 1561 819 35 5 30 30
0 1 116976 6080 269400 603208 1 0 7317 121 1535 762 31 6 31 32
1 16 331784 4452 2488 25132 30 7369 1916 7441 1177 333 7 6 6 81
1 26 627540 3116 2172 23156 134 10159 217 10173 1159 200 0 4 0 96
5 29 884564 3144 2036 16032 468 9443 622 9471 1106 435 0 5 0 95
0 50 1097880 2800 2108 8592 484 7141 794 7164 1119 831 0 6 0 94
2) sysrq-M (This one looks strange to me because of
"Free swap: 2326708kB")
SysRq : Show Memory
Mem-info:
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
Free pages: 2136kB (512kB HighMem)
Active:832 inactive:103 dirty:0 writeback:0 unstable:0 free:534
DMA free:256kB min:16kB low:32kB high:48kB active:0kB inactive:0kB
Normal free:1368kB min:936kB low:1872kB high:2808kB active:1380kB inactive:352kB
HighMem free:512kB min:512kB low:1024kB high:1536kB active:2008kB inactive:0kB
DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 256kB
Normal: 170*4kB 10*8kB 0*16kB 1*32kB 1*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 1368kB
HighMem: 8*4kB 0*8kB 2*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 512kB
Swap cache: add 392088, delete 392033, find 22279/32705, race 16+47
Free swap: 2326708kB
524288 pages of RAM
294912 pages of HIGHMEM
5821 reserved pages
860 pages shared
55 pages swap cached
3) sysrq-T:
http://sysadminday.org.ru/2.6.3-io_lockup/ope/sysrq-T
4) Last four copies of /proc/vmstat:
Tue Feb 24 02:36:53 MSK 2004
nr_dirty 320
nr_writeback 0
nr_unstable 0
nr_page_table_pages 822
nr_mapped 289207
nr_slab 11709
pgpgin 15829228
pgpgout 18320340
pswpin 25882
pswpout 37006
pgalloc 28844087
pgfree 28845931
pgactivate 923552
pgdeactivate 760039
pgfault 25500106
pgmajfault 66503
pgscan 7611061
pgrefill 4989936
pgsteal 5628844
pginodesteal 0
kswapd_steal 5211828
kswapd_inodesteal 2958
pageoutrun 33148
allocstall 12799
pgrotated 205322
Tue Feb 24 02:37:03 MSK 2004
nr_dirty 566
nr_writeback 0
nr_unstable 0
nr_page_table_pages 823
nr_mapped 289174
nr_slab 11733
pgpgin 15917192
pgpgout 18321888
pswpin 25882
pswpout 37006
pgalloc 28886326
pgfree 28888201
pgactivate 923806
pgdeactivate 760254
pgfault 25519499
pgmajfault 66550
pgscan 7633363
pgrefill 5008883
pgsteal 5650891
pginodesteal 0
kswapd_steal 5233875
kswapd_inodesteal 2958
pageoutrun 33287
allocstall 12799
pgrotated 205322
Tue Feb 24 02:37:23 MSK 2004
nr_dirty 4
nr_writeback 4559
nr_unstable 0
nr_page_table_pages 962
nr_mapped 197703
nr_slab 4887
pgpgin 15935652
pgpgout 18698124
pswpin 26444
pswpout 130749
pgalloc 29203531
pgfree 29204764
pgactivate 927401
pgdeactivate 944643
pgfault 25525960
pgmajfault 66694
pgscan 9534651
pgrefill 6027760
pgsteal 5952333
pginodesteal 0
kswapd_steal 5421086
kswapd_inodesteal 4181
pageoutrun 33500
allocstall 16189
pgrotated 292969
Tue Feb 24 02:38:16 MSK 2004
nr_dirty 0
nr_writeback 1805
nr_unstable 0
nr_page_table_pages 1433
nr_mapped 102046
nr_slab 4782
pgpgin 15956340
pgpgout 19099784
pswpin 30206
pswpout 230912
pgalloc 29315002
pgfree 29316033
pgactivate 1082560
pgdeactivate 1202414
pgfault 25537663
pgmajfault 67369
pgscan 11280124
pgrefill 6802507
pgsteal 6058697
pginodesteal 0
kswapd_steal 5476702
kswapd_inodesteal 4257
pageoutrun 33668
allocstall 17610
pgrotated 391878
5) Full top output:
top - 02:39:00 up 8:22, 3 users, load average: 76.16, 25.71, 10.41
Tasks: 225 total, 1 running, 224 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.4% us, 5.3% sy, 0.0% ni, 0.2% id, 93.9% wa, 0.2% hi, 0.0% si
Mem: 2073868k total, 2071260k used, 2608k free, 2104k buffers
Swap: 3583968k total, 1097884k used, 2486084k free, 8604k cached
25123 mysql 15 0 1002m 142m 4896 D 1.0 7.0 10:35.25 mysqld
25122 mysql 15 0 1002m 142m 4896 D 0.0 7.0 0:05.91 mysqld
24132 mysql 15 0 1002m 142m 4896 D 0.0 7.0 0:28.97 mysqld
24129 mysql 15 0 1002m 142m 4896 S 0.0 7.0 0:05.90 mysqld
24125 mysql 15 0 1002m 142m 4896 D 0.1 7.0 0:07.59 mysqld
5420 mysql 16 0 1002m 142m 4896 S 0.0 7.0 0:50.44 mysqld
4748 mysql 15 0 1002m 142m 4896 D 0.0 7.0 3:10.94 mysqld
4746 mysql 15 0 1002m 142m 4896 S 0.0 7.0 2:52.37 mysqld
970 mysql 16 0 1002m 142m 4896 S 0.0 7.0 0:19.57 mysqld
969 mysql 16 0 1002m 142m 4896 S 0.0 7.0 0:17.52 mysqld
968 mysql 15 0 1002m 142m 4896 D 0.1 7.0 0:15.47 mysqld
967 mysql 16 0 1002m 142m 4896 S 0.0 7.0 0:00.00 mysqld
958 mysql 16 0 1002m 142m 4896 S 0.0 7.0 0:15.64 mysqld
957 mysql 15 0 1002m 142m 4896 S 0.0 7.0 2:17.52 mysqld
956 mysql 16 0 1002m 142m 4896 S 0.0 7.0 0:00.16 mysqld
955 mysql 17 0 1002m 142m 4896 S 0.0 7.0 0:00.00 mysqld
954 mysql 15 0 1002m 142m 4896 S 0.0 7.0 0:00.01 mysqld
898 mysql 15 0 1002m 142m 4896 D 0.0 7.0 0:03.57 mysqld
30132 pricemat 25 0 88976 12m 1944 S 0.0 0.6 2:29.37 make_words
29381 apache 15 0 57948 3200 41m D 0.0 0.2 0:26.14 httpd
29652 apache 15 0 57920 3188 41m S 0.0 0.2 0:19.16 httpd
31015 apache 15 0 56456 2484 41m D 0.0 0.1 0:14.68 httpd
29155 apache 15 0 55064 2916 42m S 0.0 0.1 0:21.33 httpd
30281 apache 15 0 54756 5096 41m D 0.0 0.2 0:11.47 httpd
29638 apache 16 0 54744 3816 42m S 0.0 0.2 0:17.62 httpd
29540 apache 15 0 54436 4732 41m D 0.0 0.2 0:19.75 httpd
30153 apache 15 0 54404 3472 41m S 0.0 0.2 0:12.79 httpd
30123 apache 15 0 54356 4024 41m D 0.0 0.2 0:13.18 httpd
30116 apache 15 0 54316 3352 41m D 0.0 0.2 0:11.75 httpd
29647 apache 15 0 54308 4224 41m D 0.0 0.2 0:17.31 httpd
30134 apache 15 0 53416 2968 41m D 0.0 0.1 0:14.14 httpd
29651 apache 15 0 53040 3220 41m D 0.0 0.2 0:17.58 httpd
29013 apache 15 0 52888 4552 41m S 0.0 0.2 0:13.12 httpd
30619 apache 15 0 52824 3584 41m D 0.0 0.2 0:05.70 httpd
28174 apache 15 0 52692 3956 41m D 0.0 0.2 0:17.85 httpd
30926 apache 15 0 52572 2960 41m S 0.0 0.1 0:04.82 httpd
30117 apache 15 0 52464 4356 41m D 0.1 0.2 0:12.74 httpd
30135 apache 15 0 52392 3984 41m D 0.0 0.2 0:11.73 httpd
30126 apache 15 0 52380 4076 41m D 0.0 0.2 0:13.88 httpd
30133 apache 15 0 52340 2856 41m D 0.0 0.1 0:13.50 httpd
31136 apache 15 0 52316 2596 41m D 0.0 0.1 0:00.90 httpd
30127 apache 15 0 52312 3044 41m D 0.0 0.1 0:12.13 httpd
30136 apache 15 0 52208 2780 41m D 0.0 0.1 0:13.56 httpd
31138 apache 15 0 52116 3272 41m D 0.1 0.2 0:00.78 httpd
31137 apache 15 0 52068 2420 41m D 0.0 0.1 0:00.99 httpd
31289 apache 16 0 51476 1900 41m D 0.0 0.1 0:00.00 httpd
31273 apache 17 0 51360 2188 41m D 0.0 0.1 0:00.01 httpd
31261 apache 16 0 51252 1740 41m D 0.0 0.1 0:00.02 httpd
31234 apache 16 0 51220 1520 41m D 0.1 0.1 0:00.02 httpd
31208 apache 16 0 51220 1888 41m D 0.0 0.1 0:00.02 httpd
31276 apache 16 0 51144 1920 41m D 0.0 0.1 0:00.00 httpd
31274 apache 16 0 51144 1952 41m D 0.0 0.1 0:00.00 httpd
31258 apache 18 0 51144 2068 41m D 0.0 0.1 0:00.02 httpd
31255 apache 18 0 51144 2068 41m D 0.0 0.1 0:00.01 httpd
31254 apache 16 0 51144 2012 41m D 0.0 0.1 0:00.01 httpd
31252 apache 17 0 51144 1996 41m D 0.0 0.1 0:00.01 httpd
31251 apache 16 0 51144 2012 41m D 0.0 0.1 0:00.01 httpd
31238 apache 16 0 51144 2068 41m D 0.0 0.1 0:00.01 httpd
31212 apache 17 0 51144 2028 41m D 0.0 0.1 0:00.01 httpd
31288 apache 17 0 51140 2056 41m D 0.0 0.1 0:00.01 httpd
31287 apache 16 0 51140 2020 41m D 0.0 0.1 0:00.00 httpd
31227 apache 18 0 51140 2084 41m D 0.0 0.1 0:00.00 httpd
31201 apache 16 0 51140 2024 41m D 0.0 0.1 0:00.01 httpd
31225 apache 16 0 51136 1768 41m D 0.0 0.1 0:00.01 httpd
31300 apache 16 0 51132 1700 41m D 0.0 0.1 0:00.00 httpd
31285 apache 16 0 51132 2112 41m D 0.0 0.1 0:00.00 httpd
31283 apache 16 0 51132 1708 41m D 0.0 0.1 0:00.00 httpd
31280 apache 16 0 51132 1692 41m D 0.0 0.1 0:00.00 httpd
31272 apache 18 0 51132 1828 41m D 0.0 0.1 0:00.00 httpd
31257 apache 16 0 51132 2012 41m D 0.0 0.1 0:00.00 httpd
31207 apache 16 0 51132 1708 41m D 0.0 0.1 0:00.00 httpd
31243 apache 16 0 51128 1856 41m D 0.0 0.1 0:00.00 httpd
31296 apache 16 0 51120 1844 41m D 0.0 0.1 0:00.00 httpd
31295 apache 16 0 51120 1632 41m D 0.0 0.1 0:00.00 httpd
31284 apache 16 0 51120 1640 41m D 0.0 0.1 0:00.01 httpd
31277 apache 16 0 51120 1616 41m D 0.0 0.1 0:00.00 httpd
31271 apache 18 0 51120 1656 41m D 0.0 0.1 0:00.00 httpd
31220 apache 16 0 51120 1620 41m D 0.0 0.1 0:00.01 httpd
31206 apache 16 0 51108 1944 41m D 0.0 0.1 0:00.01 httpd
31249 apache 17 0 51104 1788 41m D 0.0 0.1 0:00.01 httpd
31237 apache 16 0 51104 1848 41m D 0.1 0.1 0:00.02 httpd
31253 apache 16 0 51100 2140 41m D 0.0 0.1 0:00.01 httpd
31203 apache 17 0 51100 1608 41m D 0.0 0.1 0:00.01 httpd
31211 apache 16 0 51096 2004 41m D 0.0 0.1 0:00.01 httpd
31298 apache 17 0 51092 2004 41m D 0.0 0.1 0:00.00 httpd
31282 apache 16 0 51092 2084 41m D 0.0 0.1 0:00.00 httpd
31267 apache 18 0 51092 2056 41m D 0.0 0.1 0:00.01 httpd
31313 apache 18 0 51088 1512 41m D 0.0 0.1 0:00.00 httpd
31312 apache 16 0 51088 1508 41m D 0.0 0.1 0:00.00 httpd
31310 apache 17 0 51088 1512 41m D 0.0 0.1 0:00.00 httpd
31286 apache 16 0 51088 1680 41m D 0.0 0.1 0:00.01 httpd
31281 apache 15 0 51088 1268 41m D 0.0 0.1 0:00.00 httpd
31269 apache 18 0 51088 1824 41m D 0.0 0.1 0:00.00 httpd
31268 apache 17 0 51088 1776 41m D 0.1 0.1 0:00.04 httpd
31248 apache 16 0 51088 1600 41m S 0.1 0.1 0:00.02 httpd
31242 apache 16 0 51088 1336 41m D 0.0 0.1 0:00.00 httpd
31241 apache 15 0 51088 1636 41m S 0.0 0.1 0:00.01 httpd
31236 apache 18 0 51088 1752 41m D 0.0 0.1 0:00.00 httpd
31233 apache 16 0 51088 1376 41m S 0.0 0.1 0:00.00 httpd
31231 apache 16 0 51088 1196 41m D 0.0 0.1 0:00.00 httpd
31217 apache 16 0 51088 1636 41m S 0.0 0.1 0:00.01 httpd
31214 apache 16 0 51088 1428 41m S 0.0 0.1 0:00.01 httpd
31210 apache 16 0 51088 1320 41m S 0.0 0.1 0:00.00 httpd
31205 apache 18 0 51088 1648 41m D 0.0 0.1 0:00.01 httpd
31204 apache 16 0 51088 1268 41m S 0.0 0.1 0:00.00 httpd
31235 apache 16 0 51080 1364 41m D 0.0 0.1 0:00.00 httpd
31232 apache 16 0 51080 1484 41m D 0.0 0.1 0:00.03 httpd
31219 apache 18 0 51080 1800 41m D 0.0 0.1 0:00.01 httpd
31315 apache 18 0 51076 1384 41m D 0.0 0.1 0:00.00 httpd
31314 apache 16 0 51076 1316 41m D 0.0 0.1 0:00.00 httpd
31311 apache 18 0 51076 1464 41m D 0.0 0.1 0:00.01 httpd
31309 apache 18 0 51076 1384 41m D 0.0 0.1 0:00.00 httpd
31308 apache 18 0 51076 1304 41m D 0.0 0.1 0:00.00 httpd
31306 apache 17 0 51076 1420 41m D 0.0 0.1 0:00.00 httpd
31305 apache 18 0 51076 1320 41m D 0.0 0.1 0:00.01 httpd
31304 apache 18 0 51076 1280 41m D 0.0 0.1 0:00.00 httpd
31303 apache 18 0 51076 1380 41m D 0.0 0.1 0:00.00 httpd
31302 apache 17 0 51076 1308 41m D 0.0 0.1 0:00.00 httpd
31301 apache 15 0 51076 1292 41m D 0.0 0.1 0:00.00 httpd
31297 apache 16 0 51076 1348 41m D 0.0 0.1 0:00.00 httpd
31279 apache 16 0 51076 1292 41m S 0.0 0.1 0:00.00 httpd
31278 apache 15 0 51076 1204 41m D 0.0 0.1 0:00.00 httpd
31275 apache 15 0 51076 1196 41m D 0.0 0.1 0:00.00 httpd
31260 apache 16 0 51076 1548 41m S 0.0 0.1 0:00.02 httpd
31259 apache 18 0 51076 1536 41m S 0.0 0.1 0:00.00 httpd
31256 apache 18 0 51076 1444 41m S 0.0 0.1 0:00.00 httpd
31250 apache 16 0 51076 1484 41m S 0.0 0.1 0:00.00 httpd
31247 apache 16 0 51076 1292 41m D 0.0 0.1 0:00.01 httpd
31246 apache 16 0 51076 1296 41m S 0.0 0.1 0:00.01 httpd
31245 apache 18 0 51076 1172 41m D 0.0 0.1 0:00.00 httpd
31244 apache 15 0 51076 1412 41m S 0.0 0.1 0:00.00 httpd
31240 apache 16 0 51076 1500 41m S 0.0 0.1 0:00.01 httpd
31239 apache 15 0 51076 1548 41m D 0.0 0.1 0:00.01 httpd
31230 apache 18 0 51076 1300 41m D 0.0 0.1 0:00.00 httpd
31229 apache 18 0 51076 1304 41m D 0.0 0.1 0:00.00 httpd
31228 apache 16 0 51076 1424 41m S 0.0 0.1 0:00.00 httpd
31226 apache 16 0 51076 1760 41m D 0.0 0.1 0:00.01 httpd
31223 apache 18 0 51076 1216 41m D 0.0 0.1 0:00.00 httpd
31218 apache 18 0 51076 1704 41m D 0.0 0.1 0:00.01 httpd
31216 apache 18 0 51076 1208 41m D 0.0 0.1 0:00.00 httpd
31215 apache 16 0 51076 1240 41m D 0.0 0.1 0:00.00 httpd
31202 apache 17 0 51076 1620 41m D 0.0 0.1 0:00.01 httpd
31325 root 17 0 51064 1320 41m D 0.0 0.1 0:00.00 httpd
31324 root 17 0 51064 1320 41m D 0.0 0.1 0:00.00 httpd
31323 root 15 0 51064 1320 41m D 0.0 0.1 0:00.00 httpd
31322 root 15 0 51064 1320 41m D 0.0 0.1 0:00.00 httpd
31319 root 18 0 51064 1288 41m D 0.0 0.1 0:00.00 httpd
31318 root 17 0 51064 1312 41m D 0.0 0.1 0:00.00 httpd
31316 root 18 0 51064 1328 41m D 0.0 0.1 0:00.00 httpd
794 root 17 0 51064 1192 41m S 0.1 0.1 0:01.67 httpd
23885 pricemat 16 0 5652 1124 4892 S 0.0 0.1 0:00.02 php
23980 pricemat 17 0 5648 652 4892 S 0.0 0.0 0:00.01 php
1430 root 15 0 3780 260 3112 S 0.0 0.0 0:00.61 sshd
8273 root 15 0 3716 468 3112 S 0.0 0.0 0:01.26 sshd
994 root 16 0 3660 516 3112 S 0.0 0.0 0:10.24 sshd
2147 root 15 0 3572 176 3112 S 0.0 0.0 0:00.12 sshd
2129 root 16 0 3572 156 3112 S 0.0 0.0 0:00.76 sshd
1919 root 15 0 3532 128 3112 S 0.0 0.0 0:00.11 sshd
1480 root 16 0 3488 84 2224 S 0.0 0.0 0:00.83 bash
2991 rathamah 16 0 3336 420 2052 S 0.0 0.0 0:00.04 bash
1431 rathamah 16 0 2828 880 2052 S 0.0 0.0 0:00.04 bash
770 dnscache 15 0 2712 24 1412 S 0.0 0.0 0:17.59 dnscache
728 root 16 0 2672 176 2560 S 0.0 0.0 0:01.72 sshd
1001 rathamah 16 0 2588 48 2052 S 0.0 0.0 0:00.03 bash
2957 root 17 0 2388 40 1984 S 0.0 0.0 0:00.02 login
846 root 20 0 2284 252 2120 S 0.0 0.0 0:00.02 mysqld_safe
750 root 16 0 2212 44 1900 S 0.0 0.0 0:00.00 xinetd
1478 root 16 0 2148 216 1788 S 0.0 0.0 0:00.00 su
1062 rathamah 15 0 2132 508 1728 D 0.4 0.0 2:37.73 top
8278 rathamah 15 0 2028 560 1728 R 0.2 0.0 0:18.48 top
31292 mobilius 18 0 1972 124 1884 S 0.0 0.0 0:00.00 lacheck.sh
31263 mobilius 18 0 1972 112 1884 S 0.0 0.0 0:00.01 sh
2131 rathamah 15 0 1964 80 1884 S 0.0 0.0 0:04.17 proc_vmstat.sh
1192 apache 16 0 1952 44 1816 S 0.0 0.0 0:00.60 cache_clean
708 ntp 16 0 1936 1928 1792 S 0.0 0.1 0:00.10 ntpd
619 root 15 0 1840 304 1624 S 0.0 0.0 0:06.43 syslogd
837 root 15 0 1728 136 1536 S 0.0 0.0 0:00.14 crond
23884 root 16 0 1620 160 1536 S 0.0 0.0 0:00.00 crond
31264 root 18 0 1616 108 1536 S 0.0 0.0 0:00.00 crond
31262 root 17 0 1616 96 1536 S 0.0 0.0 0:00.00 crond
1088 root 23 0 1616 84 1536 S 0.0 0.0 0:00.00 crond
625 root 16 0 1532 188 1364 S 0.0 0.0 0:00.19 klogd
2149 rathamah 15 0 1464 88 1404 S 0.0 0.0 0:00.02 vmstat
863 qmails 15 0 1444 196 1388 S 0.0 0.0 0:45.57 qmail-send
31321 megashop 18 0 1436 200 1388 D 0.0 0.0 0:00.00 qmail-inject
31293 megashop 18 0 1436 200 1388 D 0.0 0.0 0:00.01 qmail-inject
23939 pricemat 16 0 1436 8 1388 S 0.0 0.0 0:00.00 qmail-inject
23 root 17 0 1436 432 1392 S 0.0 0.0 0:00.66 devfsd
31317 rathamah 17 0 1432 124 1420 D 0.0 0.0 0:00.01 date
1201 urs 15 0 1420 100 1392 D 0.0 0.0 0:00.54 tcpserver
1187 urs 18 0 1420 44 1392 S 0.0 0.0 0:00.00 tcpserver
767 root 15 0 1420 76 1356 S 0.0 0.0 0:00.12 svscan
1 root 15 0 1420 424 1372 D 0.0 0.0 0:04.26 init
866 qmaill 15 0 1412 292 1352 S 0.0 0.0 0:00.62 splogger
867 root 15 0 1404 96 1360 S 0.0 0.0 0:00.15 qmail-lspawn
868 qmailr 16 0 1400 96 1356 S 0.0 0.0 0:00.17 qmail-rspawn
771 dnslog 15 0 1400 16 1368 S 0.0 0.0 0:10.64 multilog
1261 root 16 0 1392 40 1352 S 0.0 0.0 0:00.00 mingetty
919 root 16 0 1392 36 1352 S 0.0 0.0 0:00.00 mingetty
916 root 16 0 1392 92 1352 S 0.0 0.0 0:00.00 mingetty
915 root 16 0 1392 64 1352 S 0.0 0.0 0:00.00 mingetty
914 root 16 0 1392 80 1352 S 0.0 0.0 0:00.00 mingetty
913 root 16 0 1392 76 1352 S 0.0 0.0 0:00.00 mingetty
869 qmailq 15 0 1388 84 1356 S 0.0 0.0 0:00.66 qmail-clean
769 root 16 0 1388 12 1360 S 0.0 0.0 0:00.00 supervise
768 root 16 0 1388 12 1360 S 0.0 0.0 0:00.00 supervise
31307 mobilius 18 0 376 116 348 D 0.0 0.0 0:00.00 awk
31320 root 15 0 0 0 0 D 0.0 0.0 0:00.00 pdflush
31222 root 15 0 0 0 0 D 0.0 0.0 0:00.01 pdflush
24026 root 15 0 0 0 0 D 0.0 0.0 0:05.24 pdflush
18 root 5 -10 0 0 0 S 0.0 0.0 0:00.04 reiserfs/1
17 root 5 -10 0 0 0 S 0.0 0.0 0:00.02 reiserfs/0
16 root 18 0 0 0 0 S 0.0 0.0 0:00.15 kseriod
15 root 15 -10 0 0 0 S 0.0 0.0 0:00.00 aio/1
14 root 10 -10 0 0 0 S 0.0 0.0 0:00.00 aio/0
13 root 15 0 0 0 0 D 8.9 0.0 0:23.43 kswapd0
10 root 15 0 0 0 0 S 0.0 0.0 0:00.00 kirqd
9 root 5 -10 0 0 0 S 0.0 0.0 0:00.01 kblockd/1
8 root 5 -10 0 0 0 S 0.0 0.0 0:00.01 kblockd/0
7 root 5 -10 0 0 0 S 0.0 0.0 0:00.02 events/1
6 root 5 -10 0 0 0 S 0.0 0.0 0:00.03 events/0
5 root 34 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/1
4 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/1
3 root 34 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
2 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
>
> Also, run
>
> while true
> do
> cat /proc/meminfo
> sleep 10
> done
>
> and record the info which that leaves behind when the machine locks up.
> This should tell us whether it is an application or kernel memory leak. If
> it is indeed a leak.
Will do this next time.
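For anyone reproducing this, the suggested loop can be sketched as a small logger. This is not from the thread: the function names, the output path argument and the use of fsync are assumptions made for illustration. It appends a timestamped copy of /proc/meminfo at a fixed interval and pushes each snapshot to disk, so the last record before a hang has a better chance of surviving.

```python
import os
import time
from datetime import datetime


def take_snapshot(path="/proc/meminfo"):
    """Return the contents of `path` prefixed with a timestamp line."""
    with open(path) as f:
        body = f.read()
    return f"{datetime.now():%a %b %d %H:%M:%S %Y}\n{body}"


def log_snapshots(out_path, interval=10, count=None, path="/proc/meminfo"):
    """Append snapshots to out_path; run forever when count is None.

    Hypothetical helper: each snapshot is flushed and fsync'ed so that
    it is not left sitting in the page cache when the machine locks up.
    """
    taken = 0
    with open(out_path, "a") as out:
        while count is None or taken < count:
            out.write(take_snapshot(path))
            out.flush()
            os.fsync(out.fileno())
            taken += 1
            if count is None or taken < count:
                time.sleep(interval)
    return taken
```

Calling log_snapshots("/var/tmp/meminfo.log") would run until interrupted (the path is only an example). If the lockup is in the I/O path, the fsync may itself block, so printing the snapshots to a remote terminal or serial console is probably the more robust choice.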
>
>
>
--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc
* Re: 2.6.1 IO lockup on SMP systems
2004-02-23 21:30 ` Mike Fedyk
@ 2004-02-24 11:56 ` Sergey S. Kostyliov
0 siblings, 0 replies; 24+ messages in thread
From: Sergey S. Kostyliov @ 2004-02-24 11:56 UTC (permalink / raw)
To: Mike Fedyk; +Cc: Andrew Morton, linux-kernel, Alexander Y. Fomichev, anton
On Tuesday 24 February 2004 00:30, Mike Fedyk wrote:
> Sergey S. Kostyliov wrote:
> > Hello Andrew,
> >
> > Now this happens for the third time.
> >
> >
> >>>>I've just reproduced this lockup with 2.6.3.
> >>>>
> >>>>
> >>>>>You may need a serial console to be able to capture all the output.
> >>>>>
> >>>>>Also, it would be useful to know what sort of load the machines are
> >>>>>under, and what filesystems are in use.
> >>>>
> >>>>The machine is a http server. The main applications are:
> >>>>1) apache 1.3 which serves php pages (mod_php):
> >>>> 15.3 requests/sec - 111.9 kB/second - 7.3 kB/request
> >>>> 54 requests currently being processed, 19 idle servers
> >>>>2) mysql:
> >>>> Threads: 19 Questions: 26922012 Slow queries: 9799 Opens: 64980
> >>>> Flush tables: 1 Open tables: 630 Queries per second avg: 143.547
> >>>>
> >>>>This is an IO bound machine in general. All filesystems are reiserfs.
> >>>>
> >>>>Here is a sysrq-T output obtained from a locked box via serial console:
> >>>
> >>>OK, so everything is stuck trying to allocate memory. Perhaps you ran out
> >>>of swapspace, or some process has gone berzerk allocating memory.
> >
> >
> > Memory exhaustion is indeed possible on this box. I'll double-check the
> > ulimit and /etc/security/limits.conf settings. The only thing that worries
> > me is that this box had been running for months without any problems on
> > 2.4.23aa1.
> >
> > I have added another 2 GB of swap space (I hope this gives enough time
> > to find the memory-hungry process(es)).
>
> Also check how much memory is being used for slab in /proc/meminfo
Thanks for the hint, will do this next time.
--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc
* Re: 2.6.1 IO lockup on SMP systems
2004-02-24 11:54 ` Sergey S. Kostyliov
@ 2004-02-26 12:19 ` Sergey S. Kostyliov
2004-02-26 12:53 ` Andrew Morton
0 siblings, 1 reply; 24+ messages in thread
From: Sergey S. Kostyliov @ 2004-02-26 12:19 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, gluk, anton, Mike Fedyk
On Tuesday 24 February 2004 14:54, Sergey S. Kostyliov wrote:
> On Tuesday 24 February 2004 01:26, Andrew Morton wrote:
> > "Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
>
> <cut>
>
> > > Memory exhaustion is indeed possible on this box. I'll double-check the
> > > ulimit and /etc/security/limits.conf settings. The only thing that worries
> > > me is that this box had been running for months without any problems on
> > > 2.4.23aa1.
> >
> > It is conceivable that you have some application which runs OK on 2.4.x but
> > has some subtle bug which causes the app to go crazy on a 2.6 kernel
> > consuming lots of memory. Or there's a bug in the 2.6 kernel ;)
> >
> > > I have added another 2 GB of swap space (I hope this gives enough time
> > > to find the memory-hungry process(es)).
>
> <cut>
>
> >
> > OK, so it's doing a lot of swapping and your swap utilisation is
> > continuously increasing. I would suspect an application or kernel memory
> > leak.
> >
> > I suggest you keep that `vmstat 30' running all the time. When the machine
> > dies, take a look at the final 20 lines.
>
> Here is from the last lockup:
Yet another lockup has just occurred. I could be wrong, but from the
/proc/meminfo content it doesn't look like a memory leak (neither kernel
nor userspace), does it?
1) Last 3 /proc/meminfo snapshots before the hang:
===============================
Thu Feb 26 04:58:34 MSK 2004
MemTotal: 2073868 kB
MemFree: 7008 kB
Buffers: 223100 kB
Cached: 593368 kB
SwapCached: 748824 kB
Active: 1776280 kB
Inactive: 226160 kB
HighTotal: 1179648 kB
HighFree: 2560 kB
LowTotal: 894220 kB
LowFree: 4448 kB
SwapTotal: 3583968 kB
SwapFree: 2675616 kB
Dirty: 2156 kB
Writeback: 0 kB
Mapped: 1219740 kB
Slab: 43668 kB
Committed_AS: 1846968 kB
PageTables: 4020 kB
VmallocTotal: 114680 kB
VmallocUsed: 7448 kB
VmallocChunk: 107232 kB
Thu Feb 26 04:59:05 MSK 2004
MemTotal: 2073868 kB
MemFree: 3972 kB
Buffers: 2268 kB
Cached: 36132 kB
SwapCached: 726940 kB
Active: 1157256 kB
Inactive: 3696 kB
HighTotal: 1179648 kB
HighFree: 704 kB
LowTotal: 894220 kB
LowFree: 3268 kB
SwapTotal: 3583968 kB
SwapFree: 2633444 kB
Dirty: 20 kB
Writeback: 3376 kB
Mapped: 1154812 kB
Slab: 27996 kB
Committed_AS: 1851456 kB
PageTables: 4052 kB
VmallocTotal: 114680 kB
VmallocUsed: 7448 kB
VmallocChunk: 107232 kB
Thu Feb 26 05:00:15 MSK 2004
MemTotal: 2073868 kB
MemFree: 2528 kB
Buffers: 2180 kB
Cached: 34216 kB
SwapCached: 643808 kB
Active: 999316 kB
Inactive: 12088 kB
HighTotal: 1179648 kB
HighFree: 576 kB
LowTotal: 894220 kB
LowFree: 1952 kB
SwapTotal: 3583968 kB
SwapFree: 2559796 kB
Dirty: 0 kB
Writeback: 3052 kB
Mapped: 1001208 kB
Slab: 23932 kB
Committed_AS: 1979784 kB
PageTables: 4840 kB
VmallocTotal: 114680 kB
VmallocUsed: 7448 kB
VmallocChunk: 107232 kB
2) sysrq-M:
===========
SysRq : Show Memory
Mem-info:
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
Free pages: 2120kB (512kB HighMem)
Active:1067 inactive:93 dirty:0 writeback:0 unstable:0 free:530
DMA free:176kB min:16kB low:32kB high:48kB active:884kB inactive:0kB
Normal free:1432kB min:936kB low:1872kB high:2808kB active:1376kB inactive:372kB
HighMem free:512kB min:512kB low:1024kB high:1536kB active:2008kB inactive:0kB
DMA: 0*4kB 0*8kB 1*16kB 1*32kB 0*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 176kB
Normal: 248*4kB 3*8kB 0*16kB 1*32kB 6*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1432kB
HighMem: 0*4kB 0*8kB 0*16kB 2*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 512kB
Swap cache: add 1726105, delete 1726052, find 1388170/1627421, race 19+488
Free swap: 2195688kB
524288 pages of RAM
294912 pages of HIGHMEM
5821 reserved pages
993 pages shared
54 pages swap cached
3) sysrq-T:
===========
http://sysadminday.org.ru/2.6.3-lockup/20040226/sysrq-T
4) `vmstat 30':
===============
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 19 1255096 1952 1996 19920 426 1763 505 1778 1068 172 0 1 0 99
0 24 1260156 1944 2028 19816 374 1650 463 1670 1067 165 0 1 0 99
0 18 1266576 1880 2000 18960 372 1835 449 1847 1072 177 0 1 0 99
0 19 1274696 2904 1960 17892 366 2002 422 2007 1054 179 0 1 0 99
0 14 1279000 2896 1916 17356 203 1683 243 1693 1037 137 0 1 0 99
0 19 1288068 2472 1912 16608 180 2074 220 2085 1048 138 0 1 0 99
1 13 1294388 2152 1932 16404 253 1841 302 1849 1037 117 0 1 0 99
0 17 1301552 2328 1956 15684 318 1866 375 1880 1037 162 0 1 0 99
0 18 1307280 2448 1956 15024 331 1697 408 1714 1041 155 0 1 0 99
0 20 1312696 2184 1852 13948 480 1720 549 1732 1041 166 0 1 0 99
0 21 1321756 2308 1952 13400 435 2012 572 2028 1048 191 0 1 0 99
0 20 1330740 2372 1840 12152 509 1920 564 1939 1045 162 0 1 0 99
0 19 1336432 2616 1844 11252 513 1697 568 1704 1043 135 0 1 0 99
0 20 1342256 2364 1896 10704 520 1810 573 1816 1042 185 0 1 0 99
0 17 1350608 2868 1796 10112 368 2079 412 2092 1040 133 0 1 0 99
0 19 1356100 2176 1988 9120 401 1668 533 1677 1039 161 0 1 0 99
0 20 1359692 2248 2004 8876 369 1500 482 1514 1039 169 0 1 0 99
0 19 1364868 2696 1904 8428 455 1643 604 1658 1038 172 0 1 0 99
0 20 1371124 2876 1920 7212 537 2133 745 2147 1312 209 0 1 0 99
0 20 1378172 3192 1832 6036 614 1623 793 1631 1042 180 0 1 0 99
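The trend in the table above can be put into numbers with a short script. This is an illustrative sketch, not something posted in the thread: the helper names are made up, and it assumes the rows really are 30 seconds apart, as the `vmstat 30' invocation implies.

```python
# The 20 samples from the `vmstat 30' run above, copied verbatim.
VMSTAT_30 = """\
0 19 1255096 1952 1996 19920 426 1763 505 1778 1068 172 0 1 0 99
0 24 1260156 1944 2028 19816 374 1650 463 1670 1067 165 0 1 0 99
0 18 1266576 1880 2000 18960 372 1835 449 1847 1072 177 0 1 0 99
0 19 1274696 2904 1960 17892 366 2002 422 2007 1054 179 0 1 0 99
0 14 1279000 2896 1916 17356 203 1683 243 1693 1037 137 0 1 0 99
0 19 1288068 2472 1912 16608 180 2074 220 2085 1048 138 0 1 0 99
1 13 1294388 2152 1932 16404 253 1841 302 1849 1037 117 0 1 0 99
0 17 1301552 2328 1956 15684 318 1866 375 1880 1037 162 0 1 0 99
0 18 1307280 2448 1956 15024 331 1697 408 1714 1041 155 0 1 0 99
0 20 1312696 2184 1852 13948 480 1720 549 1732 1041 166 0 1 0 99
0 21 1321756 2308 1952 13400 435 2012 572 2028 1048 191 0 1 0 99
0 20 1330740 2372 1840 12152 509 1920 564 1939 1045 162 0 1 0 99
0 19 1336432 2616 1844 11252 513 1697 568 1704 1043 135 0 1 0 99
0 20 1342256 2364 1896 10704 520 1810 573 1816 1042 185 0 1 0 99
0 17 1350608 2868 1796 10112 368 2079 412 2092 1040 133 0 1 0 99
0 19 1356100 2176 1988 9120 401 1668 533 1677 1039 161 0 1 0 99
0 20 1359692 2248 2004 8876 369 1500 482 1514 1039 169 0 1 0 99
0 19 1364868 2696 1904 8428 455 1643 604 1658 1038 172 0 1 0 99
0 20 1371124 2876 1920 7212 537 2133 745 2147 1312 209 0 1 0 99
0 20 1378172 3192 1832 6036 614 1623 793 1631 1042 180 0 1 0 99
"""

COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa".split()


def parse_vmstat(text):
    """Turn purely numeric vmstat rows into dicts keyed by column name."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == len(COLUMNS) and all(p.isdigit() for p in parts):
            rows.append(dict(zip(COLUMNS, map(int, parts))))
    return rows


def trend(rows, column, interval_s=30):
    """Return (total change, change per second) between first and last row."""
    delta = rows[-1][column] - rows[0][column]
    elapsed = (len(rows) - 1) * interval_s
    return delta, delta / elapsed


rows = parse_vmstat(VMSTAT_30)
swpd_delta, swpd_rate = trend(rows, "swpd")
cache_delta, _ = trend(rows, "cache")
print(f"swpd grew {swpd_delta} kB ({swpd_rate:.1f} kB/s)")
print(f"cache changed by {cache_delta} kB")
```

Over these 570 seconds swap usage grows by 123076 kB while the page cache shrinks by 13884 kB, which matches the earlier observation that swap utilisation keeps increasing.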
--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc
* Re: 2.6.1 IO lockup on SMP systems
2004-02-26 12:19 ` Sergey S. Kostyliov
@ 2004-02-26 12:53 ` Andrew Morton
2004-02-26 13:11 ` Andrew Morton
2004-02-26 14:30 ` Sergey S. Kostyliov
0 siblings, 2 replies; 24+ messages in thread
From: Andrew Morton @ 2004-02-26 12:53 UTC (permalink / raw)
To: Sergey S. Kostyliov; +Cc: linux-kernel, gluk, anton, mfedyk
"Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
>
> Yet another lockup has just occurred. I could be wrong, but from the
> /proc/meminfo content it doesn't look like a memory leak (neither kernel
> nor userspace), does it?
I think it's a kernel leak.
> Thu Feb 26 05:00:15 MSK 2004
> MemTotal: 2073868 kB
> MemFree: 2528 kB
> Buffers: 2180 kB
> Cached: 34216 kB
> SwapCached: 643808 kB
> Active: 999316 kB
> Inactive: 12088 kB
> HighTotal: 1179648 kB
> HighFree: 576 kB
> LowTotal: 894220 kB
> LowFree: 1952 kB
> SwapTotal: 3583968 kB
> SwapFree: 2559796 kB
> Dirty: 0 kB
> Writeback: 3052 kB
> Mapped: 1001208 kB
> Slab: 23932 kB
> Committed_AS: 1979784 kB
> PageTables: 4840 kB
> VmallocTotal: 114680 kB
> VmallocUsed: 7448 kB
> VmallocChunk: 107232 kB
A gig of mapped memory, most of it in swapcache. That's probably all
highmem. Only a gig of memory on the page LRU. Where is the rest? Lost.
Almost no pagecache at all, slab is small.
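The "where is the rest?" arithmetic can be reproduced from the quoted snapshot. The sketch below is an annotation, not part of the original message. It assumes that MemFree, the LRU lists (Active + Inactive), Slab and PageTables are disjoint and together should cover most of MemTotal; that is only a rough accounting model, and the function names are hypothetical.

```python
# Fields from the 05:00:15 /proc/meminfo snapshot quoted above.
MEMINFO_0500 = """\
MemTotal: 2073868 kB
MemFree: 2528 kB
Active: 999316 kB
Inactive: 12088 kB
Slab: 23932 kB
PageTables: 4840 kB
"""


def parse_meminfo(text):
    """Parse 'Name: value kB' lines into a {name: kB} dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def unaccounted_kb(m):
    """Memory not explained by free pages, LRU pages, slab or page tables."""
    accounted = (m["MemFree"] + m["Active"] + m["Inactive"]
                 + m["Slab"] + m["PageTables"])
    return m["MemTotal"] - accounted


missing = unaccounted_kb(parse_meminfo(MEMINFO_0500))
print(f"unaccounted: {missing} kB")  # unaccounted: 1031164 kB
```

Under these assumptions roughly 1 GB of the 2 GB is not visible in any of the listed counters, which is the gap the analysis above attributes to a kernel leak.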
> 3) sysrq-T:
> ===========
> http://sysadminday.org.ru/2.6.3-lockup/20040226/sysrq-T
hm, you have 34 instances of crond running. How odd.
> 3) `vmstat 30':
> ===============
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 0 19 1255096 1952 1996 19920 426 1763 505 1778 1068 172 0 1 0 99
> 0 24 1260156 1944 2028 19816 374 1650 463 1670 1067 165 0 1 0 99
Again, all your memory has vanished.
I'd say that we've leaked everything in lowmem and everyone is stuck trying
to reclaim some lowmem memory. Not sure why the oom-killer didn't do
anything. I haven't tested it in a year - maybe it broke.
So. What are you using which is different from everyone else? DAC960 I
see. What about firewall setups, NIC drivers, RAID/MD/etc? Anything in
there which isn't a mainstream thing?
* Re: 2.6.1 IO lockup on SMP systems
2004-02-26 12:53 ` Andrew Morton
@ 2004-02-26 13:11 ` Andrew Morton
2004-02-26 14:37 ` Dave Jones
2004-02-26 14:30 ` Sergey S. Kostyliov
1 sibling, 1 reply; 24+ messages in thread
From: Andrew Morton @ 2004-02-26 13:11 UTC (permalink / raw)
To: rathamahata, linux-kernel, gluk, anton, mfedyk
Andrew Morton <akpm@osdl.org> wrote:
>
> Not sure why the oom-killer didn't do anything.
There's still free swap space. The oom-killer has problems.
* Re: 2.6.1 IO lockup on SMP systems
2004-02-26 12:53 ` Andrew Morton
2004-02-26 13:11 ` Andrew Morton
@ 2004-02-26 14:30 ` Sergey S. Kostyliov
2004-02-26 20:03 ` Andrew Morton
1 sibling, 1 reply; 24+ messages in thread
From: Sergey S. Kostyliov @ 2004-02-26 14:30 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, gluk, anton, mfedyk
On Thursday 26 February 2004 15:53, Andrew Morton wrote:
> "Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
> >
> > Yet another lockup has just occurred. I could be wrong, but from the
> > /proc/meminfo content it doesn't look like a memory leak (neither kernel
> > nor userspace), does it?
>
> I think it's a kernel leak.
>
> > Thu Feb 26 05:00:15 MSK 2004
> > MemTotal: 2073868 kB
> > MemFree: 2528 kB
> > Buffers: 2180 kB
> > Cached: 34216 kB
> > SwapCached: 643808 kB
> > Active: 999316 kB
> > Inactive: 12088 kB
> > HighTotal: 1179648 kB
> > HighFree: 576 kB
> > LowTotal: 894220 kB
> > LowFree: 1952 kB
> > SwapTotal: 3583968 kB
> > SwapFree: 2559796 kB
> > Dirty: 0 kB
> > Writeback: 3052 kB
> > Mapped: 1001208 kB
> > Slab: 23932 kB
> > Committed_AS: 1979784 kB
> > PageTables: 4840 kB
> > VmallocTotal: 114680 kB
> > VmallocUsed: 7448 kB
> > VmallocChunk: 107232 kB
>
> A gig of mapped memory, most of it in swapcache. That's probably all
> highmem. Only a gig of memory on the page LRU. Where is the rest? Lost.
>
> Almost no pagecache at all, slab is small.
>
> > 3) sysrq-T:
> > ===========
> > http://sysadminday.org.ru/2.6.3-lockup/20040226/sysrq-T
>
> hm, you have 34 instances of crond running. How odd.
>
> > 3) `vmstat 30':
> > ===============
> > procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
> > r b swpd free buff cache si so bi bo in cs us sy id wa
> > 0 19 1255096 1952 1996 19920 426 1763 505 1778 1068 172 0 1 0 99
> > 0 24 1260156 1944 2028 19816 374 1650 463 1670 1067 165 0 1 0 99
>
> Again, all your memory has vanished.
>
> I'd say that we've leaked everything in lowmem and everyone is stuck trying
> to reclaim some lowmem memory. Not sure why the oom-killer didn't do
> anything. I haven't tested it in a year - maybe it broke.
>
> So. What are you using which is different from everyone else? DAC960 I
> see. What about firewall setups, NIC drivers, RAID/MD/etc? Anything in
> there which isn't a mainstream thing?
Iptables (ipt_REJECT, ip_conntrack, ipt_state and iptable_filter modules)
is used as the firewall.
I think the NICs are pretty standard:
00:04.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
00:05.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
Both are handled by the Intel e100 driver.
Only plain partitions are used (no md, dm or anything similar):
[rathamahata@ope rathamahata]$ mount
/dev/rd/host0/target0/part1 on / type reiserfs (rw)
none on /proc type proc (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/rd/host0/target1/part2 on /usr/local type reiserfs (rw)
/dev/rd/host0/target3/part1 on /var type reiserfs (rw,noatime,nodiratime)
/dev/rd/host0/target7/part1 on /var/www/html/fo type reiserfs (rw,noatime,nodiratime)
/dev/rd/host0/target2/part1 on /home type reiserfs (rw,noatime,nodiratime)
/dev/rd/host0/target4/part1 on /var/lib/innodb/1 type reiserfs (rw,noatime,nodiratime,notail)
/dev/rd/host0/target5/part1 on /var/lib/innodb/2 type reiserfs (rw,noatime,nodiratime,notail)
/dev/rd/host0/target6/part1 on /var/lib/oracle/db04 type reiserfs (rw,noatime,nodiratime,notail)
sysfs on /sys type sysfs (rw)
Here is a .config:
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_STANDALONE=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSCTL=y
CONFIG_LOG_BUF_SHIFT=15
CONFIG_KALLSYMS=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_OBSOLETE_MODPARM=y
CONFIG_KMOD=y
CONFIG_X86_PC=y
CONFIG_MPENTIUMIII=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=5
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_SMP=y
CONFIG_NR_CPUS=2
CONFIG_PREEMPT=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y
CONFIG_MICROCODE=m
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=m
CONFIG_HIGHMEM4G=y
CONFIG_HIGHMEM=y
CONFIG_MTRR=y
CONFIG_HAVE_DEC_LOCK=y
CONFIG_PM=y
CONFIG_ACPI_BOOT=y
CONFIG_APM=y
CONFIG_PCI=y
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_NAMES=y
CONFIG_BINFMT_ELF=y
CONFIG_BLK_DEV_DAC960=y
CONFIG_BLK_DEV_LOOP=m
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_NETFILTER=y
CONFIG_IP_NF_CONNTRACK=m
CONFIG_IP_NF_FTP=m
CONFIG_IP_NF_IPTABLES=y
CONFIG_IP_NF_MATCH_LIMIT=m
CONFIG_IP_NF_MATCH_IPRANGE=m
CONFIG_IP_NF_MATCH_MAC=m
CONFIG_IP_NF_MATCH_PKTTYPE=m
CONFIG_IP_NF_MATCH_MARK=m
CONFIG_IP_NF_MATCH_MULTIPORT=m
CONFIG_IP_NF_MATCH_TOS=m
CONFIG_IP_NF_MATCH_RECENT=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_DSCP=m
CONFIG_IP_NF_MATCH_LENGTH=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_MATCH_TCPMSS=m
CONFIG_IP_NF_MATCH_HELPER=m
CONFIG_IP_NF_MATCH_STATE=m
CONFIG_IP_NF_MATCH_CONNTRACK=m
CONFIG_IP_NF_MATCH_OWNER=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_TOS=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_MARK=m
CONFIG_IP_NF_TARGET_CLASSIFY=m
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_ULOG=m
CONFIG_IP_NF_TARGET_TCPMSS=m
CONFIG_IPV6_SCTP__=y
CONFIG_NETDEVICES=y
CONFIG_DUMMY=m
CONFIG_NET_ETHERNET=y
CONFIG_MII=y
CONFIG_NET_PCI=y
CONFIG_E100=y
CONFIG_INPUT=y
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_SOUND_GAMEPORT=y
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_NR_UARTS=4
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
CONFIG_UNIX98_PTYS=y
CONFIG_UNIX98_PTY_COUNT=256
CONFIG_VIDEO_SELECT=y
CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_EXT2_FS=m
CONFIG_EXT3_FS=m
CONFIG_EXT3_FS_XATTR=y
CONFIG_JBD=m
CONFIG_FS_MBCACHE=m
CONFIG_REISERFS_FS=y
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_DEVFS_FS=y
CONFIG_DEVPTS_FS=y
CONFIG_TMPFS=y
CONFIG_RAMFS=y
CONFIG_MSDOS_PARTITION=y
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
CONFIG_DEBUG_KERNEL=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y
CONFIG_X86_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_X86_TRAMPOLINE=y
CONFIG_PC=y
--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc
* Re: 2.6.1 IO lockup on SMP systems
2004-02-26 13:11 ` Andrew Morton
@ 2004-02-26 14:37 ` Dave Jones
2004-02-26 15:37 ` Arjan van de Ven
0 siblings, 1 reply; 24+ messages in thread
From: Dave Jones @ 2004-02-26 14:37 UTC (permalink / raw)
To: Andrew Morton; +Cc: rathamahata, linux-kernel, gluk, anton, mfedyk
On Thu, Feb 26, 2004 at 05:11:35AM -0800, Andrew Morton wrote:
> Andrew Morton <akpm@osdl.org> wrote:
> >
> > Not sure why the oom-killer didn't do anything.
>
> There's still free swap space. The oom-killer has problems.
That sounds odd. Surely if we have free swap, we don't
want the oom-killer to do anything ?
Dave
* Re: 2.6.1 IO lockup on SMP systems
2004-02-26 14:37 ` Dave Jones
@ 2004-02-26 15:37 ` Arjan van de Ven
0 siblings, 0 replies; 24+ messages in thread
From: Arjan van de Ven @ 2004-02-26 15:37 UTC (permalink / raw)
To: Dave Jones; +Cc: Andrew Morton, rathamahata, linux-kernel, gluk, anton, mfedyk
On Thu, 2004-02-26 at 15:37, Dave Jones wrote:
> On Thu, Feb 26, 2004 at 05:11:35AM -0800, Andrew Morton wrote:
> > Andrew Morton <akpm@osdl.org> wrote:
> > >
> > > Not sure why the oom-killer didn't do anything.
> >
> > There's still free swap space. The oom-killer has problems.
>
> That sounds odd. Surely if we have free swap, we don't
> want the oom-killer to do anything ?
With highmem it's not so easy :)
The low zone can be entirely pinned by page tables and such while the
highmem zone is free... and still you want to oom-kill.
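
To make the point concrete, here is a toy model in Python (not kernel code) of an allocation that is restricted to the low zones. The zone names are the ones the kernel reports; the `Zone` class, the `FALLBACK` table, `can_allocate` and every number are invented for illustration, and the real watermark logic is considerably more involved. Free swap does not help in this situation, because swapping only frees pages that can be swapped out, and pinned low-zone pages cannot.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    name: str
    free_kb: int
    min_kb: int  # simplified watermark: do not allocate below this


# Zones a request may fall back to, most preferred first (simplified).
# Kernel allocations such as slab objects must come from the low zones.
FALLBACK = {
    "highmem_ok": ["HighMem", "Normal", "DMA"],
    "lowmem_only": ["Normal", "DMA"],
}


def can_allocate(zones, kind, size_kb=4):
    """Return the first eligible zone that stays at or above its
    min watermark after the allocation, or None if there is none."""
    by_name = {z.name: z for z in zones}
    for name in FALLBACK[kind]:
        zone = by_name[name]
        if zone.free_kb - size_kb >= zone.min_kb:
            return name
    return None


# Hypothetical state: low zones exhausted, HighMem mostly free.
zones = [
    Zone("DMA", free_kb=16, min_kb=16),
    Zone("Normal", free_kb=900, min_kb=936),
    Zone("HighMem", free_kb=500000, min_kb=512),
]

print(can_allocate(zones, "highmem_ok"))   # a user page can still be placed
print(can_allocate(zones, "lowmem_only"))  # a low-zone allocation cannot
```

In this model the only way to satisfy the second request is to free low-zone memory, which is the case where killing a task can still make sense despite free highmem and free swap.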
* Re: 2.6.1 IO lockup on SMP systems
2004-02-26 14:30 ` Sergey S. Kostyliov
@ 2004-02-26 20:03 ` Andrew Morton
2004-02-28 14:56 ` Sergey S. Kostyliov
0 siblings, 1 reply; 24+ messages in thread
From: Andrew Morton @ 2004-02-26 20:03 UTC (permalink / raw)
To: Sergey S. Kostyliov; +Cc: linux-kernel, gluk, anton, mfedyk
"Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
>
> > So. What are you using which is different from everyone else? DAC960 I
> > see. What about firewall setups, NIC drivers, RAID/MD/etc? Anything in
> > there which isn't a mainstream thing?
>
> Iptables (ipt_REJECT,ipt_state,ip_conntrack,ipt_state,iptable_filter modules)
> is used as firewall.
>
> I think NICs are pretty usual:
> 00:04.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
> 00:05.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
> handled by Intel e100 driver.
>
> Only plain partitions (there is no md, dm or something like this):
> [rathamahata@ope rathamahata]$ mount
> /dev/rd/host0/target0/part1 on / type reiserfs (rw)
> none on /proc type proc (rw)
> none on /dev/pts type devpts (rw,gid=5,mode=620)
> /dev/rd/host0/target1/part2 on /usr/local type reiserfs (rw)
> /dev/rd/host0/target3/part1 on /var type reiserfs (rw,noatime,nodiratime)
> /dev/rd/host0/target7/part1 on /var/www/html/fo type reiserfs (rw,noatime,nodiratime)
> /dev/rd/host0/target2/part1 on /home type reiserfs (rw,noatime,nodiratime)
> /dev/rd/host0/target4/part1 on /var/lib/innodb/1 type reiserfs (rw,noatime,nodiratime,notail)
> /dev/rd/host0/target5/part1 on /var/lib/innodb/2 type reiserfs (rw,noatime,nodiratime,notail)
> /dev/rd/host0/target6/part1 on /var/lib/oracle/db04 type reiserfs (rw,noatime,nodiratime,notail)
> sysfs on /sys type sysfs (rw)
OK, thanks. Is there any possibility that you can run without iptables for
a while, see if that fixes it?
* Re: 2.6.1 IO lockup on SMP systems
2004-02-26 20:03 ` Andrew Morton
@ 2004-02-28 14:56 ` Sergey S. Kostyliov
2004-04-08 9:08 ` 2.6.X kernel memory leak? (was: Re: 2.6.1 IO lockup on SMP systems) Sergey S. Kostyliov
0 siblings, 1 reply; 24+ messages in thread
From: Sergey S. Kostyliov @ 2004-02-28 14:56 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, gluk, anton, mfedyk
On Thursday 26 February 2004 23:03, Andrew Morton wrote:
<cut>
> OK, thanks. Is there any possibility that you can run without iptables for
> a while, see if that fixes it?
I recompiled 2.6.3 without iptables support; unfortunately that doesn't
solve the problem, the machine still hangs.
1) sysrq-M:
SysRq : Show Memory
Mem-info:
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
Free pages: 3276kB (512kB HighMem)
Active:820 inactive:195 dirty:0 writeback:0 unstable:0 free:819
DMA free:1348kB min:16kB low:32kB high:48kB active:316kB inactive:0kB
Normal free:1416kB min:936kB low:1872kB high:2808kB active:1388kB inactive:348kB
HighMem free:512kB min:512kB low:1024kB high:1536kB active:1604kB inactive:404kB
DMA: 75*4kB 69*8kB 21*16kB 3*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1348kB
Normal: 98*4kB 20*8kB 0*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 1416kB
HighMem: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 2*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 512kB
Swap cache: add 342862, delete 342774, find 15349/23980, race 14+29
Free swap: 2473044kB
524288 pages of RAM
294912 pages of HIGHMEM
5814 reserved pages
899 pages shared
89 pages swap cached
2) /proc/meminfo before a lockup
Sat Feb 28 06:42:33 MSK 2004
MemTotal: 2073896 kB
MemFree: 3452 kB
Buffers: 2240 kB
Cached: 29648 kB
SwapCached: 21084 kB
Active: 627896 kB
Inactive: 17340 kB
HighTotal: 1179648 kB
HighFree: 576 kB
LowTotal: 894248 kB
LowFree: 2876 kB
SwapTotal: 3583968 kB
SwapFree: 3095996 kB
Dirty: 0 kB
Writeback: 14104 kB
Mapped: 625540 kB
Slab: 19044 kB
Committed_AS: 1767368 kB
PageTables: 4316 kB
VmallocTotal: 114680 kB
VmallocUsed: 7448 kB
VmallocChunk: 107232 kB
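
A rough accounting of the /proc/meminfo snapshot above, as a Python sketch. Which counters to add up is my assumption: Buffers, Cached, SwapCached and Mapped are left out because they largely overlap with Active/Inactive, and some kernel allocations are not reported by any of these fields. The helper names `parse_meminfo` and `unaccounted_kb` are made up for this example; the values are copied from the snapshot.

```python
# Subset of the /proc/meminfo snapshot taken before the lockup.
MEMINFO = """\
MemTotal: 2073896 kB
MemFree: 3452 kB
Active: 627896 kB
Inactive: 17340 kB
Slab: 19044 kB
PageTables: 4316 kB
"""

# Counters treated as "explained" memory. This choice is an assumption.
ACCOUNTED = ("MemFree", "Active", "Inactive", "Slab", "PageTables")


def parse_meminfo(text):
    """Parse 'Key: value kB' lines into a dict of integers (kB)."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])
    return fields


def unaccounted_kb(fields):
    """Memory not covered by any of the ACCOUNTED counters."""
    return fields["MemTotal"] - sum(fields[k] for k in ACCOUNTED)


info = parse_meminfo(MEMINFO)
gap = unaccounted_kb(info)
share = 100 * gap / info["MemTotal"]
print(f"unaccounted: {gap} kB ({share:.0f}% of MemTotal)")
```

With these counters roughly 1.4 GB is not attributed to anything, which would be consistent with memory being held somewhere these fields do not report; it does not show where.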
--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc
* 2.6.X kernel memory leak? (was: Re: 2.6.1 IO lockup on SMP systems)
2004-02-28 14:56 ` Sergey S. Kostyliov
@ 2004-04-08 9:08 ` Sergey S. Kostyliov
2004-04-09 7:17 ` 2.6.X kernel memory leak? Sergey S. Kostyliov
0 siblings, 1 reply; 24+ messages in thread
From: Sergey S. Kostyliov @ 2004-04-08 9:08 UTC (permalink / raw)
To: linux-kernel; +Cc: Anton Kovalenko
Hello all,
On Saturday 28 February 2004 17:56, Sergey S. Kostyliov wrote:
> On Thursday 26 February 2004 23:03, Andrew Morton wrote:
> <cut>
> > OK, thanks. Is there any possibility that you can run without iptables for
> > a while, see if that fixes it?
>
> I recompiled 2.6.3 without iptables support; unfortunately that doesn't
> solve the problem, the machine still hangs.
It looks like the problem hasn't gone away in the latest kernels. The visible
symptoms haven't changed: the machine is pingable, TCP ports which were in
LISTEN state remain in LISTEN after the lockup, nothing else.
The latest lockup is on a different machine than in my previous reports,
so I suspect this is not a hardware issue. The kernel is 2.6.5-aa3 but
I believe Andrea's changes are not related to this problem.
sysrq-M
http://sysadminday.org.ru/2.6.X-lockup/terror/20040408/sysrq-M
sysrq-T
http://sysadminday.org.ru/2.6.X-lockup/terror/20040408/sysrq-T
.config
http://sysadminday.org.ru/2.6.X-lockup/terror/.config
`lspci -vv'
http://sysadminday.org.ru/2.6.X-lockup/terror/lspci_-vv
`dmesg'
http://sysadminday.org.ru/2.6.X-lockup/terror/dmesg
/etc/fstab
http://sysadminday.org.ru/2.6.X-lockup/terror/fstab
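
A sysrq-T dump is long, so a quick tally by task state helps when reading it. The Python sketch below assumes the 2.6-era layout (a header line with task name, state, PC, free stack and PID, followed by `[<addr>] symbol+0x...` frames). `SAMPLE` is an abbreviated excerpt in that format, and the regular expressions and the helper names `parse_tasks` and `stuck_in_reclaim` are my own, not part of any kernel tool.

```python
import re
from collections import Counter

# Abbreviated sample in sysrq-T format; a real dump would be read from
# a file or from the kernel log instead.
SAMPLE = """\
multilog S F7BF3D60 0 3302 3288 (NOTLB)
Call Trace:
[<c016ee7c>] pipe_wait+0x7c/0xa0
[<c0160f18>] sys_read+0x38/0x60
proftpd D 00000000 0 3328 1 3364 3282 (NOTLB)
Call Trace:
[<c020e8ab>] blk_congestion_wait+0x7b/0x90
[<c014205b>] __alloc_pages+0x29b/0x330
httpd R running 0 12742 3100 12743 12741 (NOTLB)
sshd D 00000000 0 3364 1 3238 3391 3328 (NOTLB)
Call Trace:
[<c020e8ab>] blk_congestion_wait+0x7b/0x90
[<c014205b>] __alloc_pages+0x29b/0x330
[<c015bc11>] read_swap_cache_async+0x101/0x10d
"""

# Header: name, one-letter state, PC (or "running"), free stack, PID.
TASK_RE = re.compile(r"^\s*(\S+)\s+([RSDTZ])\s+\S+\s+\d+\s+(\d+)\s")
# Frame: [<address>] symbol+0xoffset/0xsize
FRAME_RE = re.compile(r"\[<[0-9a-f]{8}>\]\s+(\w+)\+0x")


def parse_tasks(text):
    """Split a dump into tasks, each with its list of frame symbols."""
    tasks = []
    for line in text.splitlines():
        header = TASK_RE.match(line)
        if header:
            tasks.append({"name": header.group(1),
                          "state": header.group(2),
                          "pid": int(header.group(3)),
                          "frames": []})
            continue
        frame = FRAME_RE.search(line)
        if frame and tasks:
            tasks[-1]["frames"].append(frame.group(1))
    return tasks


def stuck_in_reclaim(task):
    """D-state task whose trace shows __alloc_pages waiting on congestion."""
    wanted = {"blk_congestion_wait", "__alloc_pages"}
    return task["state"] == "D" and wanted <= set(task["frames"])


tasks = parse_tasks(SAMPLE)
states = Counter(t["state"] for t in tasks)
stuck = [(t["name"], t["pid"]) for t in tasks if stuck_in_reclaim(t)]
print(states)
print(stuck)
```

Many unrelated tasks in D state under the same two frames would suggest they are all waiting for free pages rather than for one particular device.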
--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc
* Re: 2.6.X kernel memory leak?
2004-04-08 9:08 ` 2.6.X kernel memory leak? (was: Re: 2.6.1 IO lockup on SMP systems) Sergey S. Kostyliov
@ 2004-04-09 7:17 ` Sergey S. Kostyliov
2004-04-09 9:09 ` Andrew Morton
0 siblings, 1 reply; 24+ messages in thread
From: Sergey S. Kostyliov @ 2004-04-09 7:17 UTC (permalink / raw)
To: linux-kernel; +Cc: Anton Kovalenko
On Thursday 08 April 2004 13:08, Sergey S. Kostyliov wrote:
> Hello all,
>
> On Saturday 28 February 2004 17:56, Sergey S. Kostyliov wrote:
> > On Thursday 26 February 2004 23:03, Andrew Morton wrote:
> > <cut>
> > > OK, thanks. Is there any possibility that you can run without iptables for
> > > a while, see if that fixes it?
> >
> > I recompiled 2.6.3 without iptables support; unfortunately that doesn't
> > solve the problem, the machine still hangs.
>
> It looks like the problem hasn't gone away in the latest kernels. The visible
> symptoms haven't changed: the machine is pingable, TCP ports which were in
> LISTEN state remain in LISTEN after the lockup, nothing else.
>
> The latest lockup is on a different machine than in my previous reports,
> so I suspect this is not a hardware issue. The kernel is 2.6.5-aa3 but
> I believe Andrea's changes are not related to this problem.
>
> sysrq-M
> http://sysadminday.org.ru/2.6.X-lockup/terror/20040408/sysrq-M
>
> sysrq-T
> http://sysadminday.org.ru/2.6.X-lockup/terror/20040408/sysrq-T
>
> .config
> http://sysadminday.org.ru/2.6.X-lockup/terror/.config
>
> `lspci -vv'
> http://sysadminday.org.ru/2.6.X-lockup/terror/lspci_-vv
>
> `dmesg'
> http://sysadminday.org.ru/2.6.X-lockup/terror/dmesg
>
> /etc/fstab
> http://sysadminday.org.ru/2.6.X-lockup/terror/fstab
>
>
And here is part of the sysrq-T output for the third machine, which has just
locked up; the kernel is 2.6.5-rc3-aa2.
multilog S F7BF3D60 0 3302 3288 (NOTLB)
f7b83ed8 00000082 00000001 f7bf3d60 f7b83e9c c011a771 f7a4db80 00000000
00000003 f7bf3d58 f7b82000 00000282 f7aaece0 00000000 0804ea70 f7aaece0
f7aaed00 c180dbe0 0000111c 19e0b9c6 0001faed f7a89a70 f7b83f00 f7a6bb80
Call Trace:
[<c011a771>] __wake_up_common+0x31/0x60
[<c016ee7c>] pipe_wait+0x7c/0xa0
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c016f07a>] pipe_readv+0x1da/0x2c0
[<c016f180>] pipe_read+0x20/0x30
[<c0160cc0>] vfs_read+0xb0/0x110
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
qmail-lspawn S C030D340 0 3325 3301 3326 (NOTLB)
f74c5ea4 00000082 c0117444 c030d340 00000246 01470f60 f7cd8b80 c030d6d0
00000000 c030d6c0 c1382d20 00000000 00000000 19c98941 0001faed f7aaece0
f7aaed00 c1815be0 00004ec0 19ca1051 0001faed f7bb3a10 00000010 f74c5eb4
Call Trace:
[<c0117444>] do_page_fault+0x304/0x4ef
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c0175a90>] __pollwait+0x80/0xd0
[<c016f5d2>] pipe_poll+0x32/0x90
[<c0175da2>] do_select+0x1c2/0x330
[<c0175a10>] __pollwait+0x0/0xd0
[<c017620e>] sys_select+0x2de/0x4d0
[<c016030f>] filp_close+0x4f/0x80
[<c01073c9>] sysenter_past_esp+0x52/0x71
qmail-rspawn S C030D300 0 3326 3301 3327 3325 (NOTLB)
f74d9ea4 00000082 f74d8000 c030d300 00000246 01468f60 f7a9eb80 c181756c
f74d9e58 c030d680 c11654c0 00000000 00000000 c0118397 00000000 f7aaece0
f7aaed00 c180dbe0 0000f336 ad9386e9 000010b2 f747bad0 cbf9ff0c f74d9eb4
Call Trace:
[<c0118397>] recalc_task_prio+0x97/0x1c0
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c0175a90>] __pollwait+0x80/0xd0
[<c016f5d2>] pipe_poll+0x32/0x90
[<c0175da2>] do_select+0x1c2/0x330
[<c0175a10>] __pollwait+0x0/0xd0
[<c017620e>] sys_select+0x2de/0x4d0
[<c016030f>] filp_close+0x4f/0x80
[<c01073c9>] sysenter_past_esp+0x52/0x71
qmail-clean S 00000012 0 3327 3301 3326 (NOTLB)
f7445ed8 00000082 f7445f00 00000012 c01bfa2f 00000000 f7a9e280 f7445ea8
c0118397 b1f8808e 3cc9b81f f7a9e940 19ec7e20 0001faed c180dbe0 e8af92d0
e8af92f0 c1815be0 00008b3e 19ecb28e 0001faed f747b500 00000082 f74dbf00
Call Trace:
[<c01bfa2f>] do_journal_end+0xcf/0xbe0
[<c0118397>] recalc_task_prio+0x97/0x1c0
[<c016ee7c>] pipe_wait+0x7c/0xa0
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c016f07a>] pipe_readv+0x1da/0x2c0
[<c016f42d>] pipe_writev+0x29d/0x360
[<c016f180>] pipe_read+0x20/0x30
[<c0160cc0>] vfs_read+0xb0/0x110
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
proftpd D 00000000 0 3328 1 3364 3282 (NOTLB)
f7413d34 00000086 00000000 00000000 00000000 00000000 f7a9edc0 00000000
00000000 00000000 00000000 00000000 f7412000 00000000 00000246 f73d0da0
f73d0dc0 c180dbe0 000005ed 980d0112 000222af f747af30 00000000 c030dc20
Call Trace:
[<c0129642>] schedule_timeout+0x72/0xd0
[<c01295c0>] process_timeout+0x0/0x10
[<c011bfa8>] io_schedule_timeout+0x28/0x40
[<c020e8ab>] blk_congestion_wait+0x7b/0x90
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c014205b>] __alloc_pages+0x29b/0x330
[<c0144e8d>] do_page_cache_readahead+0x1cd/0x280
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c013e49f>] filemap_nopage+0x17f/0x460
[<c015004b>] do_no_page+0xdb/0x680
[<c013cc31>] unlock_page+0x11/0x60
[<c014f435>] do_wp_page+0x4c5/0x570
[<c015081c>] handle_mm_fault+0xec/0x1d0
[<c0117444>] do_page_fault+0x304/0x4ef
[<c010f555>] convert_fxsr_from_user+0x15/0xe0
[<c010f92c>] restore_i387+0x8c/0x90
[<c01066b4>] restore_sigcontext+0x114/0x130
[<c01067b2>] sys_sigreturn+0xe2/0x150
[<c0117140>] do_page_fault+0x0/0x4ef
[<c0107e85>] error_code+0x2d/0x38
sshd D 00000000 0 3364 1 3238 3391 3328 (NOTLB)
f73f7d18 00000082 00000000 00000000 00000000 00000000 f7a5d040 00000000
00000000 00000000 00000000 00000000 f73f6000 00000000 00000246 00000000
ffffffff c1815be0 00000149 de751563 000222af f740cf50 00000000 c030dc20
Call Trace:
[<c0129642>] schedule_timeout+0x72/0xd0
[<c01295c0>] process_timeout+0x0/0x10
[<c011bfa8>] io_schedule_timeout+0x28/0x40
[<c020e8ab>] blk_congestion_wait+0x7b/0x90
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c014205b>] __alloc_pages+0x29b/0x330
[<c015bc11>] read_swap_cache_async+0x101/0x10d
[<c014f79f>] swapin_readahead+0x2f/0xd0
[<c014fb57>] do_swap_page+0x317/0x430
[<c014d835>] pte_alloc_map+0xc5/0x130
[<c01507f8>] handle_mm_fault+0xc8/0x1d0
[<c0117444>] do_page_fault+0x304/0x4ef
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c0175a90>] __pollwait+0x80/0xd0
[<c028bd2d>] tcp_poll+0x1d/0x170
[<c0175a04>] poll_freewait+0x44/0x50
[<c0175dc7>] do_select+0x1e7/0x330
[<c0117140>] do_page_fault+0x0/0x4ef
[<c0107e85>] error_code+0x2d/0x38
[<c017007b>] get_write_access+0x4b/0xe0
[<c01cf8e0>] __copy_to_user_ll+0x40/0x60
[<c01762e7>] sys_select+0x3b7/0x4d0
[<c010f92c>] restore_i387+0x8c/0x90
[<c01066b4>] restore_sigcontext+0x114/0x130
[<c01073c9>] sysenter_past_esp+0x52/0x71
cron D 00000000 0 3391 1 6677 3401 3364 (NOTLB)
f73cbd34 00000082 00000000 00000000 00000000 00000000 f7bd5280 00000000
00000000 00000000 00000000 00000000 f73ca000 00000000 00000246 00000000
ffffffff c180dbe0 00000180 980ff35e 000222af f73d0f70 00000000 c030dc20
Call Trace:
[<c0129642>] schedule_timeout+0x72/0xd0
[<c01295c0>] process_timeout+0x0/0x10
[<c011bfa8>] io_schedule_timeout+0x28/0x40
[<c020e8ab>] blk_congestion_wait+0x7b/0x90
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c014205b>] __alloc_pages+0x29b/0x330
[<c011a2c9>] schedule+0x389/0x7a0
[<c0144e8d>] do_page_cache_readahead+0x1cd/0x280
[<c013e49f>] filemap_nopage+0x17f/0x460
[<c015004b>] do_no_page+0xdb/0x680
[<c015081c>] handle_mm_fault+0xec/0x1d0
[<c0117444>] do_page_fault+0x304/0x4ef
[<c016c10b>] sys_stat64+0x2b/0x30
[<c0117140>] do_page_fault+0x0/0x4ef
[<c0107e85>] error_code+0x2d/0x38
agetty D 00000000 0 3401 1 3402 3391 (NOTLB)
f7badc54 00000086 00000000 00000000 00000000 00000000 f7b07dc0 00000000
00000000 00000000 00000000 00000000 f7bac000 c180e540 c01287ac 00000000
ffffffff c180dbe0 00018704 9823a1c3 000222af f7a88ed0 00000000 c030dc20
Call Trace:
[<c01287ac>] __mod_timer+0x23c/0x370
[<c0129642>] schedule_timeout+0x72/0xd0
[<c01295c0>] process_timeout+0x0/0x10
[<c011bfa8>] io_schedule_timeout+0x28/0x40
[<c020e8ab>] blk_congestion_wait+0x7b/0x90
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c014205b>] __alloc_pages+0x29b/0x330
[<c015bc11>] read_swap_cache_async+0x101/0x10d
[<c014f79f>] swapin_readahead+0x2f/0xd0
[<c014fb57>] do_swap_page+0x317/0x430
[<c014d835>] pte_alloc_map+0xc5/0x130
[<c01507f8>] handle_mm_fault+0xc8/0x1d0
[<c0117444>] do_page_fault+0x304/0x4ef
[<c011a2c9>] schedule+0x389/0x7a0
[<c01bee6f>] journal_end+0xf/0x20
[<c01aeac7>] reiserfs_dirty_inode+0x77/0x110
[<c0117140>] do_page_fault+0x0/0x4ef
[<c0107e85>] error_code+0x2d/0x38
[<c025007b>] sg_res_in_use+0x6b/0x80
[<c01cf8e0>] __copy_to_user_ll+0x40/0x60
[<c01fa91d>] read_chan+0x5dd/0xb00
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c014e246>] unmap_vmas+0xf6/0x310
[<c011a730>] default_wake_function+0x0/0x10
[<c01f46dd>] tty_write+0x1ad/0x360
[<c01f44f6>] tty_read+0x176/0x1b0
[<c0160cc0>] vfs_read+0xb0/0x110
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
agetty S F7D2E800 0 3402 1 3403 3401 (NOTLB)
f7a87e58 00000082 00000000 f7d2e800 f7a87e20 f78200d8 f7b07b80 f7820114
c01bee6f 00000000 c01aeac7 000000ff 00000000 c02d88f7 00000000 00000001
0064d901 c180dbe0 000850e1 ef9caa45 00000013 f7a414e0 00000286 f7d2e800
Call Trace:
[<c01bee6f>] journal_end+0xf/0x20
[<c01aeac7>] reiserfs_dirty_inode+0x77/0x110
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c0205ba3>] do_con_write+0x2b3/0x740
[<c01facaa>] read_chan+0x96a/0xb00
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c014e246>] unmap_vmas+0xf6/0x310
[<c011a730>] default_wake_function+0x0/0x10
[<c01f46dd>] tty_write+0x1ad/0x360
[<c01f44f6>] tty_read+0x176/0x1b0
[<c0160cc0>] vfs_read+0xb0/0x110
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
agetty S 00003500 0 3403 1 3404 3402 (NOTLB)
f73e5e58 00000082 00000000 00003500 175c6fc1 f7de0844 f7b07940 e05da8c0
00000011 00000000 f7de9220 c192f000 c0283e93 c192f000 00000000 f7de9220
f7de0830 c180dbe0 000b78d0 ef8aa30a 00000013 f740c980 00000286 f7d2e800
Call Trace:
[<c0283e93>] ip_local_deliver+0xd3/0x1f0
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c0205ba3>] do_con_write+0x2b3/0x740
[<c01facaa>] read_chan+0x96a/0xb00
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c014e246>] unmap_vmas+0xf6/0x310
[<c011a730>] default_wake_function+0x0/0x10
[<c01f46dd>] tty_write+0x1ad/0x360
[<c01f44f6>] tty_read+0x176/0x1b0
[<c0160cc0>] vfs_read+0xb0/0x110
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
agetty S 00000000 0 3404 1 3405 3403 (NOTLB)
f74c3e58 00000082 000001ff 00000000 00000003 00000000 f7a4d280 00000000
00020000 00000000 f74c3e6c 000000ff 00000000 00000000 00000000 00000003
00000286 c1815be0 0007e435 ef918c51 00000013 f7bb3440 00000286 00000000
Call Trace:
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c0205ba3>] do_con_write+0x2b3/0x740
[<c01facaa>] read_chan+0x96a/0xb00
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c014e246>] unmap_vmas+0xf6/0x310
[<c011a730>] default_wake_function+0x0/0x10
[<c01f46dd>] tty_write+0x1ad/0x360
[<c01f44f6>] tty_read+0x176/0x1b0
[<c0160cc0>] vfs_read+0xb0/0x110
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
agetty S 00000000 0 3405 1 3406 3404 (NOTLB)
f73cfe58 00000086 000001ff 00000000 00000004 00000000 f7a6a4c0 00000000
00020000 00000000 f73cfe6c 000000ff 00000000 00000000 00000000 00000004
00000286 c180dbe0 00084d71 efb34714 00000013 f73d1b10 00000286 00000000
Call Trace:
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c0205ba3>] do_con_write+0x2b3/0x740
[<c01facaa>] read_chan+0x96a/0xb00
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c014e246>] unmap_vmas+0xf6/0x310
[<c011a730>] default_wake_function+0x0/0x10
[<c01f46dd>] tty_write+0x1ad/0x360
[<c01f44f6>] tty_read+0x176/0x1b0
[<c0160cc0>] vfs_read+0xb0/0x110
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
agetty S F7D2E800 0 3406 1 4611 3405 (NOTLB)
f73e3e58 00000082 00000000 f7d2e800 f73e3e20 f78200d8 f7cd8040 f7820114
c01bee6f 00000000 c01aeac7 000000ff 00000000 c02d88f7 00000000 00000001
0064d901 c1815be0 0007cc5c efa78898 00000013 f740c3b0 00000286 f7d2e800
Call Trace:
[<c01bee6f>] journal_end+0xf/0x20
[<c01aeac7>] reiserfs_dirty_inode+0x77/0x110
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c0205ba3>] do_con_write+0x2b3/0x740
[<c01facaa>] read_chan+0x96a/0xb00
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c014e246>] unmap_vmas+0xf6/0x310
[<c011a730>] default_wake_function+0x0/0x10
[<c01f46dd>] tty_write+0x1ad/0x360
[<c01f44f6>] tty_read+0x176/0x1b0
[<c0160cc0>] vfs_read+0xb0/0x110
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
ntpd D 00000000 0 4611 1 3406 (NOTLB)
ce397bd0 00000082 00000000 00000000 00000000 00000000 f7bd5b80 00000000
00000000 00000000 00000000 00000000 ce396000 00000000 00000246 f7bb2100
f7bb2120 c1815be0 0000018d f19a9e2b 000222af cc3de3b0 00000000 c030dc20
Call Trace:
[<c0129642>] schedule_timeout+0x72/0xd0
[<c01295c0>] process_timeout+0x0/0x10
[<c011bfa8>] io_schedule_timeout+0x28/0x40
[<c020e8ab>] blk_congestion_wait+0x7b/0x90
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c014205b>] __alloc_pages+0x29b/0x330
[<c013cf9d>] find_lock_page+0x4d/0x270
[<c013f42c>] generic_file_aio_write_nolock+0x33c/0xba0
[<c0148054>] mark_page_accessed+0x34/0x40
[<c0164052>] __find_get_block+0x62/0xc0
[<c0164052>] __find_get_block+0x62/0xc0
[<c01b6c92>] search_by_key+0x642/0xe10
[<c013fced>] generic_file_write_nolock+0x5d/0x80
[<c019fa27>] reiserfs_find_entry+0x97/0x150
[<c013fddf>] generic_file_write+0x3f/0x60
[<c01aa31f>] reiserfs_file_write+0x7ff/0x810
[<c019fbfc>] reiserfs_lookup+0x11c/0x1f0
[<c0130dd9>] in_group_p+0x39/0x70
[<c016ff29>] vfs_permission+0x79/0x140
[<c017a24c>] dput+0x1c/0x3a0
[<c01701fa>] path_release+0xa/0x30
[<c0171be7>] open_namei+0xb7/0x3e0
[<c015fc6d>] filp_open+0x2d/0x60
[<c0160e80>] vfs_write+0xb0/0x110
[<c0160f78>] sys_write+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
httpd D 00000000 0 12739 3100 12740 (NOTLB)
ee10dc70 00000086 00000000 00000000 00000000 00000000 f7a9c4c0 00000000
00000000 00000000 00000000 00000000 ee10c000 9821ec28 000222af eff200a0
eff200c0 c180dbe0 00000644 9821eff8 000222af f714e3f0 00000000 c030dc20
Call Trace:
[<c0129642>] schedule_timeout+0x72/0xd0
[<c01295c0>] process_timeout+0x0/0x10
[<c011bfa8>] io_schedule_timeout+0x28/0x40
[<c020e8ab>] blk_congestion_wait+0x7b/0x90
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c014205b>] __alloc_pages+0x29b/0x330
[<c015bc11>] read_swap_cache_async+0x101/0x10d
[<c014f79f>] swapin_readahead+0x2f/0xd0
[<c014fb57>] do_swap_page+0x317/0x430
[<c014d835>] pte_alloc_map+0xc5/0x130
[<c01507f8>] handle_mm_fault+0xc8/0x1d0
[<c0117444>] do_page_fault+0x304/0x4ef
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c02688fa>] lock_sock+0x6a/0xc0
[<c0268e09>] __kfree_skb+0x79/0x100
[<c02905fc>] wait_for_connect+0xec/0x110
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c0117140>] do_page_fault+0x0/0x4ef
[<c0107e85>] error_code+0x2d/0x38
[<c01cf529>] __get_user_4+0x11/0x17
[<c0264955>] move_addr_to_user+0x25/0x90
[<c017dab0>] new_inode+0x10/0xc0
[<c0265f7c>] sys_accept+0xec/0x160
[<c028fd7b>] tcp_close+0x36b/0x720
[<c0266b05>] sys_socketcall+0xf5/0x2a0
[<c01073c9>] sysenter_past_esp+0x52/0x71
httpd D 00000000 0 12740 3100 12741 12739 (NOTLB)
dc433c70 00000086 00000000 00000000 00000000 00000000 e45efdc0 00000000
00000000 00000000 00000000 00000000 dc432000 00000000 00000246 f714e220
f714e240 c180dbe0 0000022d 981ea476 000222af eff219b0 00000000 c030dc20
Call Trace:
[<c0129642>] schedule_timeout+0x72/0xd0
[<c01295c0>] process_timeout+0x0/0x10
[<c011bfa8>] io_schedule_timeout+0x28/0x40
[<c020e8ab>] blk_congestion_wait+0x7b/0x90
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c014205b>] __alloc_pages+0x29b/0x330
[<c015bc11>] read_swap_cache_async+0x101/0x10d
[<c014f79f>] swapin_readahead+0x2f/0xd0
[<c014fb57>] do_swap_page+0x317/0x430
[<c014d835>] pte_alloc_map+0xc5/0x130
[<c01507f8>] handle_mm_fault+0xc8/0x1d0
[<c0117444>] do_page_fault+0x304/0x4ef
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c02688fa>] lock_sock+0x6a/0xc0
[<c0268e09>] __kfree_skb+0x79/0x100
[<c02905fc>] wait_for_connect+0xec/0x110
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c0117140>] do_page_fault+0x0/0x4ef
[<c0107e85>] error_code+0x2d/0x38
[<c01cf529>] __get_user_4+0x11/0x17
[<c0264955>] move_addr_to_user+0x25/0x90
[<c017dab0>] new_inode+0x10/0xc0
[<c0265f7c>] sys_accept+0xec/0x160
[<c028fd7b>] tcp_close+0x36b/0x720
[<c0266b05>] sys_socketcall+0xf5/0x2a0
[<c01073c9>] sysenter_past_esp+0x52/0x71
httpd D 00000000 0 12741 3100 12742 12740 (NOTLB)
f580dc70 00000086 00000000 00000000 00000000 00000000 f7a6a700 00000000
00000000 00000000 00000000 00000000 f580c000 00000000 00000246 00000000
ffffffff c1815be0 00000182 fdf459c5 000222af f7a63a90 00000000 c030dc20
Call Trace:
[<c0129642>] schedule_timeout+0x72/0xd0
[<c01295c0>] process_timeout+0x0/0x10
[<c011bfa8>] io_schedule_timeout+0x28/0x40
[<c020e8ab>] blk_congestion_wait+0x7b/0x90
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c014205b>] __alloc_pages+0x29b/0x330
[<c015bc11>] read_swap_cache_async+0x101/0x10d
[<c014f79f>] swapin_readahead+0x2f/0xd0
[<c014fb57>] do_swap_page+0x317/0x430
[<c014d835>] pte_alloc_map+0xc5/0x130
[<c01507f8>] handle_mm_fault+0xc8/0x1d0
[<c0117444>] do_page_fault+0x304/0x4ef
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c02688fa>] lock_sock+0x6a/0xc0
[<c0268e09>] __kfree_skb+0x79/0x100
[<c02905fc>] wait_for_connect+0xec/0x110
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c0117140>] do_page_fault+0x0/0x4ef
[<c0107e85>] error_code+0x2d/0x38
[<c01cf529>] __get_user_4+0x11/0x17
[<c0264955>] move_addr_to_user+0x25/0x90
[<c017dab0>] new_inode+0x10/0xc0
[<c0265f7c>] sys_accept+0xec/0x160
[<c028fd7b>] tcp_close+0x36b/0x720
[<c0266b05>] sys_socketcall+0xf5/0x2a0
[<c01073c9>] sysenter_past_esp+0x52/0x71
httpd R running 0 12742 3100 12743 12741 (NOTLB)
httpd D 00000000 0 12743 3100 13713 12742 (NOTLB)
f7501c70 00000086 00000000 00000000 00000000 00000000 f7a9e700 00000000
00000000 00000000 00000000 00000000 f7500000 00000000 00000246 f73d07d0
f73d07f0 c1815be0 0000015c 0315f596 000222b0 f7c9d9f0 00000000 c030dc20
Call Trace:
[<c0129642>] schedule_timeout+0x72/0xd0
[<c01295c0>] process_timeout+0x0/0x10
[<c011bfa8>] io_schedule_timeout+0x28/0x40
[<c020e8ab>] blk_congestion_wait+0x7b/0x90
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c014205b>] __alloc_pages+0x29b/0x330
[<c015bc11>] read_swap_cache_async+0x101/0x10d
[<c014f79f>] swapin_readahead+0x2f/0xd0
[<c014fb57>] do_swap_page+0x317/0x430
[<c014d835>] pte_alloc_map+0xc5/0x130
[<c01507f8>] handle_mm_fault+0xc8/0x1d0
[<c0117444>] do_page_fault+0x304/0x4ef
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c02688fa>] lock_sock+0x6a/0xc0
[<c0268e09>] __kfree_skb+0x79/0x100
[<c02905fc>] wait_for_connect+0xec/0x110
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c0117140>] do_page_fault+0x0/0x4ef
[<c0107e85>] error_code+0x2d/0x38
[<c01cf529>] __get_user_4+0x11/0x17
[<c0264955>] move_addr_to_user+0x25/0x90
[<c017dab0>] new_inode+0x10/0xc0
[<c0265f7c>] sys_accept+0xec/0x160
[<c028fd7b>] tcp_close+0x36b/0x720
[<c0266b05>] sys_socketcall+0xf5/0x2a0
[<c01073c9>] sysenter_past_esp+0x52/0x71
httpd D 00000000 0 13713 3100 19047 12743 (NOTLB)
df9a5c70 00000082 00000000 00000000 00000000 00000000 e45ef280 00000000
00000000 00000000 00000000 00000000 df9a4000 00000000 00000246 00000000
ffffffff c180dbe0 000001f6 976392ca 000222af f714e9c0 00000000 c030dc20
Call Trace:
[<c0129642>] schedule_timeout+0x72/0xd0
[<c01295c0>] process_timeout+0x0/0x10
[<c011bfa8>] io_schedule_timeout+0x28/0x40
[<c020e8ab>] blk_congestion_wait+0x7b/0x90
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c014205b>] __alloc_pages+0x29b/0x330
[<c015bc11>] read_swap_cache_async+0x101/0x10d
[<c014f79f>] swapin_readahead+0x2f/0xd0
[<c014fb57>] do_swap_page+0x317/0x430
[<c014d835>] pte_alloc_map+0xc5/0x130
[<c01507f8>] handle_mm_fault+0xc8/0x1d0
[<c0117444>] do_page_fault+0x304/0x4ef
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c02688fa>] lock_sock+0x6a/0xc0
[<c0268e09>] __kfree_skb+0x79/0x100
[<c02905fc>] wait_for_connect+0xec/0x110
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c0117140>] do_page_fault+0x0/0x4ef
[<c0107e85>] error_code+0x2d/0x38
[<c01cf529>] __get_user_4+0x11/0x17
[<c0264955>] move_addr_to_user+0x25/0x90
[<c017dab0>] new_inode+0x10/0xc0
[<c0265f7c>] sys_accept+0xec/0x160
[<c014e121>] unmap_page_range+0x31/0x60
[<c014e246>] unmap_vmas+0xf6/0x310
[<c01488ff>] __pagevec_lru_add_active+0x13f/0x1b0
[<c017a24c>] dput+0x1c/0x3a0
[<c0161d39>] __fput+0xb9/0x120
[<c0266b05>] sys_socketcall+0xf5/0x2a0
[<c0152ce4>] do_munmap+0x154/0x1b0
[<c01073c9>] sysenter_past_esp+0x52/0x71
httpd D 00000000 0 19047 3100 13713 (NOTLB)
c5e5dc70 00000082 00000000 00000000 00000000 00000000 f7a9e040 00000000
00000000 00000000 00000000 00000000 c5e5c000 00000000 00000246 f740cd80
f740cda0 c1815be0 00000160 0cfd8041 000222b0 eff20840 00000000 c030dc20
Call Trace:
[<c0129642>] schedule_timeout+0x72/0xd0
[<c01295c0>] process_timeout+0x0/0x10
[<c011bfa8>] io_schedule_timeout+0x28/0x40
[<c020e8ab>] blk_congestion_wait+0x7b/0x90
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c014205b>] __alloc_pages+0x29b/0x330
[<c015bc11>] read_swap_cache_async+0x101/0x10d
[<c014f79f>] swapin_readahead+0x2f/0xd0
[<c014fb57>] do_swap_page+0x317/0x430
[<c014d835>] pte_alloc_map+0xc5/0x130
[<c01507f8>] handle_mm_fault+0xc8/0x1d0
[<c0117444>] do_page_fault+0x304/0x4ef
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c02688fa>] lock_sock+0x6a/0xc0
[<c0268e09>] __kfree_skb+0x79/0x100
[<c02905fc>] wait_for_connect+0xec/0x110
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c0117140>] do_page_fault+0x0/0x4ef
[<c0107e85>] error_code+0x2d/0x38
[<c01cf529>] __get_user_4+0x11/0x17
[<c0264955>] move_addr_to_user+0x25/0x90
[<c017dab0>] new_inode+0x10/0xc0
[<c0265f7c>] sys_accept+0xec/0x160
[<c028fd7b>] tcp_close+0x36b/0x720
[<c0266b05>] sys_socketcall+0xf5/0x2a0
[<c01073c9>] sysenter_past_esp+0x52/0x71
sshd S C3331DA4 0 3238 3364 3247 3756 (NOTLB)
c3331d7c 00000086 c3331e50 c3331da4 00000000 c3331e50 c1fa6dc0 00000000
000475c4 00000000 fac86840 00000000 00000000 c1064f80 00000001 c3331e50
c01b637e c1815be0 0004594a 64ab66fa 0001d39c f7aafa50 c3331da8 e06aaa38
Call Trace:
[<c01b637e>] pathrelse+0x1e/0x30
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c013fced>] generic_file_write_nolock+0x5d/0x80
[<c02c090a>] unix_stream_data_wait+0xfa/0x180
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c02c1003>] unix_stream_recvmsg+0x673/0x710
[<c01aa31f>] reiserfs_file_write+0x7ff/0x810
[<c0265030>] sock_aio_read+0xb0/0xd0
[<c0160bcd>] do_sync_read+0x6d/0xb0
[<c01f556e>] release_dev+0x33e/0x7e0
[<c015fdec>] dentry_open+0x14c/0x220
[<c015fc8f>] filp_open+0x4f/0x60
[<c0160cf7>] vfs_read+0xe7/0x110
[<c0161d39>] __fput+0xb9/0x120
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
sshd D 00000000 0 3247 3238 3248 (NOTLB)
c309dd18 00000086 00000000 00000000 00000000 00000000 c1fa64c0 00000000
00000000 00000000 00000000 00000000 c309c000 00000000 00000246 00000000
ffffffff c1815be0 c030dc20
Call Trace:
[<c0129642>] schedule_timeout+0x72/0xd0
[<c01295c0>] process_timeout+0x0/0x10
[<c011bfa8>] io_schedule_timeout+0x28/0x40
[<c020e8ab>] blk_congestion_wait+0x7b/0x90
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c014205b>] __alloc_pages+0x29b/0x330
[<c015bc11>] read_swap_cache_async+0x101/0x10d
[<c014f79f>] swapin_readahead+0x2f/0xd0
[<c014fb57>] do_swap_page+0x317/0x430
[<c014d835>] pte_alloc_map+0xc5/0x130
[<c01507f8>] handle_mm_fault+0xc8/0x1d0
[<c0117444>] do_page_fault+0x304/0x4ef
[<c01fc19a>] pty_chars_in_buffer+0x1a/0x40
[<c01fc175>] pty_write_room+0x25/0x30
[<c0175a04>] poll_freewait+0x44/0x50
[<c0175dc7>] do_select+0x1e7/0x330
[<c0117140>] do_page_fault+0x0/0x4ef
[<c0107e85>] error_code+0x2d/0x38
[<c017007b>] get_write_access+0x4b/0xe0
[<c01cf8e0>] __copy_to_user_ll+0x40/0x60
[<c01762e7>] sys_select+0x3b7/0x4d0
[<c0160f78>] sys_write+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
bash S C030D340 0 3248 3247 (NOTLB)
c757be58 00000086 00000010 c030d340 00000246 01470f60 f7a5d4c0 00000000
00000000 00000010 c1817708 00000000 f406c260 c180dbe0 0001e5df 4853c415 0001e873 c19633c0 c0129026 00000001
Call Trace:
[<c0129026>] update_wall_time+0x16/0x40
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c01f81b4>] opost_block+0xf4/0x1b0
[<c01facaa>] read_chan+0x96a/0xb00
[<c014f435>] do_wp_page+0x4c5/0x570
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c01f46dd>] tty_write+0x1ad/0x360
[<c01f44f6>] tty_read+0x176/0x1b0
[<c0160cc0>] vfs_read+0xb0/0x110
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
sshd S EDEF5DA4 0 3756 3364 3759 3914 3238 (NOTLB)
edef5d7c 00000082 edef5e50 edef5da4 c010dfe0 c03c6cb0 f7a9cb80 e089c860
00000620 c192f240 c0268b62 d7c83812 d7c83812 d7c83812 00000000 00000246
f7fa3190 c1815be0 000024d4 56c3dc45 0001da75 f70aa9e0 f70ab980 f70aa810
Call Trace:
[<c010dfe0>] do_gettimeofday+0x20/0xc0
[<c0268b62>] alloc_skb+0x32/0xd0
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c02c090a>] unix_stream_data_wait+0xfa/0x180
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c0107365>] need_resched+0x27/0x32
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c02c1003>] unix_stream_recvmsg+0x673/0x710
[<c013dc64>] file_read_actor+0xc4/0xd0
[<c01aa31f>] reiserfs_file_write+0x7ff/0x810
[<c0265030>] sock_aio_read+0xb0/0xd0
[<c0160bcd>] do_sync_read+0x6d/0xb0
[<c01f556e>] release_dev+0x33e/0x7e0
[<c015fdec>] dentry_open+0x14c/0x220
[<c015fc8f>] filp_open+0x4f/0x60
[<c0160cf7>] vfs_read+0xe7/0x110
[<c0161d39>] __fput+0xb9/0x120
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
sshd D 00000000 0 3759 3756 3760 (NOTLB)
d49bfd18 00000086 00000000 00000000 00000000 00000000 c1fa6700 00000000
00000000 00000000 00000000 00000000 d49be000 00000000 00000246 00000000
ffffffff c1815be0 00000bad 1ca15353 000222b0 f7c9ce50 00000000 c030dc20
Call Trace:
[<c0129642>] schedule_timeout+0x72/0xd0
[<c01295c0>] process_timeout+0x0/0x10
[<c011bfa8>] io_schedule_timeout+0x28/0x40
[<c020e8ab>] blk_congestion_wait+0x7b/0x90
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c014205b>] __alloc_pages+0x29b/0x330
[<c015bc11>] read_swap_cache_async+0x101/0x10d
[<c014f79f>] swapin_readahead+0x2f/0xd0
[<c014fb57>] do_swap_page+0x317/0x430
[<c014d835>] pte_alloc_map+0xc5/0x130
[<c01507f8>] handle_mm_fault+0xc8/0x1d0
[<c0117444>] do_page_fault+0x304/0x4ef
[<c01fc19a>] pty_chars_in_buffer+0x1a/0x40
[<c01fc175>] pty_write_room+0x25/0x30
[<c0175a04>] poll_freewait+0x44/0x50
[<c0175dc7>] do_select+0x1e7/0x330
[<c0117140>] do_page_fault+0x0/0x4ef
[<c0107e85>] error_code+0x2d/0x38
[<c017007b>] get_write_access+0x4b/0xe0
[<c01cf8e0>] __copy_to_user_ll+0x40/0x60
[<c01762e7>] sys_select+0x3b7/0x4d0
[<c0160f78>] sys_write+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
bash S 00000246 0 3760 3759 (NOTLB)
e7045e58 00000086 c030d300 00000246 01468f60 c0141c4f f7cd84c0 db8215b4
c030d680 c12446c0 00000000 00000000 00000082 c1914c00 d20a9000 e7045e94
e7045e6c c180dbe0 0008cdf5 57ea2b52 0001db26 f714fb30 c013ce05 c011a771
Call Trace:
[<c0141c4f>] buffered_rmqueue+0x10f/0x280
[<c013ce05>] find_get_page+0x35/0xc0
[<c011a771>] __wake_up_common+0x31/0x60
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c01f81b4>] opost_block+0xf4/0x1b0
[<c01facaa>] read_chan+0x96a/0xb00
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c01f46dd>] tty_write+0x1ad/0x360
[<c01f44f6>] tty_read+0x176/0x1b0
[<c0160cc0>] vfs_read+0xb0/0x110
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
sshd S D806FDA4 0 3914 3364 3917 3756 (NOTLB)
d806fd7c 00000086 d806fe50 d806fda4 00000000 d806fe50 f7a6adc0 00000000
00047c9c 00000000 1ec86840 d806fd5c f7aae140 d5d0fb1e 02036e86 d19acde0
d19ace00 c180dbe0 0000201b 11d21020 000203d5 eff20e10 c180dbe0 d806fd98
Call Trace:
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c0118397>] recalc_task_prio+0x97/0x1c0
[<c02c090a>] unix_stream_data_wait+0xfa/0x180
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4b1>] autoremove_wake_function+0x11/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c0268b62>] alloc_skb+0x32/0xd0
[<c02c1003>] unix_stream_recvmsg+0x673/0x710
[<c0265030>] sock_aio_read+0xb0/0xd0
[<c0160bcd>] do_sync_read+0x6d/0xb0
[<c0129026>] update_wall_time+0x16/0x40
[<c0160cf7>] vfs_read+0xe7/0x110
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
sshd D 00000000 0 3917 3914 3918 (NOTLB)
f682dd18 00000000 00000000 00000000 00000000 f7a9c040 00000000 00000000 00000000 00000000 f682c000 00000000 00000246 00000000
ffffffff c180dbe0 000014af 982fad90 000222af f7aae310 00000000 c030dc20
Call Trace:
[<c0129642>] schedule_timeout+0x72/0xd0
[<c01295c0>] process_timeout+0x0/0x10
[<c011bfa8>] io_schedule_timeout+0x28/0x40
[<c020e8ab>] blk_congestion_wait+0x7b/0x90
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c014205b>] __alloc_pages+0x29b/0x330
[<c01cf8f9>] __copy_to_user_ll+0x59/0x60
[<c015bc11>] read_swap_cache_async+0x101/0x10d
[<c014f79f>] swapin_readahead+0x2f/0xd0
[<c014fb57>] do_swap_page+0x317/0x430
[<c014d835>] pte_alloc_map+0xc5/0x130
[<c01507f8>] handle_mm_fault+0xc8/0x1d0
[<c0117444>] do_page_fault+0x304/0x4ef
[<c01fc19a>] pty_chars_in_buffer+0x1a/0x40
[<c01fc175>] pty_write_room+0x25/0x30
[<c0175a04>] poll_freewait+0x44/0x50
[<c0175dc7>] do_select+0x1e7/0x330
[<c0117140>] do_page_fault+0x0/0x4ef
[<c0107e85>] error_code+0x2d/0x38
[<c017007b>] get_write_access+0x4b/0xe0
[<c01cf8e0>] __copy_to_user_ll+0x40/0x60
[<c01762e7>] sys_select+0x3b7/0x4d0
[<c0114246>] smp_apic_timer_interrupt+0xd6/0x140
[<c01073c9>] sysenter_past_esp+0x52/0x71
bash S 00000246 0 3918 3917 (NOTLB)
ef5dde58 00000082 c030d780 00000246 c030d780 c0141c4f f7bd5040 db8215b4
c030db00 c17735c0 00000000 00000000 0000038e c1914c00 eff81000 d19ac810
d19ac830 c180dbe0 000cb843 74fcf82c 0001dc80 f7a40940 c013ce05 00000000
Call Trace:
[<c0141c4f>] buffered_rmqueue+0x10f/0x280
[<c013ce05>] find_get_page+0x35/0xc0
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c01f81b4>] opost_block+0xf4/0x1b0
[<c01facaa>] read_chan+0x96a/0xb00
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c01f46dd>] tty_write+0x1ad/0x360
[<c01f44f6>] tty_read+0x176/0x1b0
[<c0160cc0>] vfs_read+0xb0/0x110
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
pdflush S 00000000 0 3951 6 24 (L-TLB)
c268df78 00000046 00000000 00000000 00000000 00000000 f7a4d940 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 f73d07d0
f73d07f0 c1815be0 00000110 29477494 000222b0 f7bb28a0 00000000 00000000
Call Trace:
[<c0144105>] __pdflush+0xd5/0x380
[<c011a771>] __wake_up_common+0x31/0x60
[<c01443b0>] pdflush+0x0/0x10
[<c01443ba>] pdflush+0xa/0x10
[<c01443b0>] pdflush+0x0/0x10
[<c0135e94>] kthread+0xa4/0xb0
[<c0135df0>] kthread+0x0/0xb0
[<c0104ec5>] kernel_thread_helper+0x5/0x10
pdflush S 00000000 0 6583 7 (L-TLB)
c4ae1f78 00000046 00000000 00000000 00000000 00000000 f7a9c040 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 c180dbe0 00000285 9884aecd 000222af eff20270 00000000 00000000
Call Trace:
[<c0144105>] __pdflush+0xd5/0x380
[<c011a771>] __wake_up_common+0x31/0x60
[<c01443b0>] pdflush+0x0/0x10
[<c01443ba>] pdflush+0xa/0x10
[<c01443b0>] pdflush+0x0/0x10
[<c0135e94>] kthread+0xa4/0xb0
[<c0135df0>] kthread+0x0/0xb0
[<c0104ec5>] kernel_thread_helper+0x5/0x10
cron S C030D300 0 6677 3391 6678 6760 (NOTLB)
f4233ed8 00000082 c0141e77 c030d300 00000010 00000001 e45efb80 d19ac240
f7b074c0 f7a85f0c c011a2c9 f4233f04 00000082 d19ac240 00000010 e7892280
e78922a0 c1815be0 0000eff8 924d6713 00020568 d19ac410 f7fffaa0 f7a62180
Call Trace:
[<c0141e77>] __alloc_pages+0xb7/0x330
[<c011a2c9>] schedule+0x389/0x7a0
[<c016ee7c>] pipe_wait+0x7c/0xa0
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c01cf8f9>] __copy_to_user_ll+0x59/0x60
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c016f07a>] pipe_readv+0x1da/0x2c0
[<c016f180>] pipe_read+0x20/0x30
[<c0160cc0>] vfs_read+0xb0/0x110
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
sh S E0B63080 0 6678 6677 6679 (NOTLB)
caa4df48 c01508a3 e78483bc e45ef940 2c5fa065
c011db30 f6f71380 e45ef940 e45ef960 f6f71380 d19ad3b0 c0117444 f73d0da0
f73d0dc0 c1815be0 0002185a 91df5d00 00020568 d19ad580 f1499544 00000001
Call Trace:
[<c01508a3>] handle_mm_fault+0x173/0x1d0
[<c011db30>] copy_mm+0x250/0x570
[<c0117444>] do_page_fault+0x304/0x4ef
[<c012354b>] sys_wait4+0x1bb/0x280
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c0123635>] sys_waitpid+0x25/0x29
[<c01073c9>] sysenter_past_esp+0x52/0x71
daily_reports S F0D7B080 0 6679 6678 6689 (NOTLB)
e9afbf48 00000086 080f1e34 f0d7b080 c01508a3 edc523c4 f7cd8700 347c9065
c011db30 f6f71770 f7cd8700 f7cd8720 f6f71770 d19ac810 c0117444 00000001
aea87c72 c180dbe0 00016174 5e27e5ad 0002057f d19ac9e0 f712a584 00000000
Call Trace:
[<c01508a3>] handle_mm_fault+0x173/0x1d0
[<c011db30>] copy_mm+0x250/0x570
[<c0117444>] do_page_fault+0x304/0x4ef
[<c012354b>] sys_wait4+0x1bb/0x280
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c0123635>] sys_waitpid+0x25/0x29
[<c01073c9>] sysenter_past_esp+0x52/0x71
mysql S C015081C 0 6689 6679 (NOTLB)
f5fe5d7c 00000086 d57ed400 c015081c 00000001 f4182998 f7d66dc0 c01bebe9
ccadf818 f7d66dc0 f7d66de0 ccadf818 f73d07d0 603d05d3 0002057f f73d07d0
f73d07f0 c1815be0 000021e1 605a5570 0002057f e78935c0 00000000 f5fe5df8
Call Trace:
[<c015081c>] handle_mm_fault+0xec/0x1d0
[<c01bebe9>] journal_mark_dirty+0x159/0x2e0
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c0141e77>] __alloc_pages+0xb7/0x330
[<c011d4b1>] autoremove_wake_function+0x11/0x40
[<c02c090a>] unix_stream_data_wait+0xfa/0x180
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c0268b62>] alloc_skb+0x32/0xd0
[<c02c1003>] unix_stream_recvmsg+0x673/0x710
[<c0265030>] sock_aio_read+0xb0/0xd0
[<c0160bcd>] do_sync_read+0x6d/0xb0
[<c014e246>] unmap_vmas+0xf6/0x310
[<c01488ff>] __pagevec_lru_add_active+0x13f/0x1b0
[<c012eb45>] sys_rt_sigaction+0xd5/0xf0
[<c0160cf7>] vfs_read+0xe7/0x110
[<c017489b>] do_fcntl+0x11b/0x1d0
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
cron S C030D300 0 6760 3391 6761 6677 (NOTLB)
eb401ed8 00000082 c0141e77 c030d300 00000010 00000001 f7a4d040 e7892280
f7a4d040 d4323c28 c011a2c9 eb401f04 00000082 e7892280 00000010 c1962c20
c1962c40 c1815be0 0000d409 4121cbab 0002062c e7892450 f7fffaa0 c1962c20
Call Trace:
[<c0141e77>] __alloc_pages+0xb7/0x330
[<c011a2c9>] schedule+0x389/0x7a0
[<c016ee7c>] pipe_wait+0x7c/0xa0
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c01cf8f9>] __copy_to_user_ll+0x59/0x60
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c016f07a>] pipe_readv+0x1da/0x2c0
[<c016f180>] pipe_read+0x20/0x30
[<c0160cc0>] vfs_read+0xb0/0x110
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
sh S F7B97860 0 6761 6760 6763 (NOTLB)
d4323f48 00000086 0002062c f7b97860 f7b97880 c1815be0 f7a4d4c0 4190165b
0002062c c1962df4 d4323f7c d4322000 d4322000 d4323f7c d4322000 e8af8160
e8af8180 c1815be0 000015ca 41908132 0002062c c1962df0 f7bf5fa4 00000001
Call Trace:
[<c012354b>] sys_wait4+0x1bb/0x280
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c0123635>] sys_waitpid+0x25/0x29
[<c01073c9>] sysenter_past_esp+0x52/0x71
php S D2527DA4 0 6763 6761 (NOTLB)
d2527d7c 00000082 d2527e50 d2527da4 00000000 d2527e50 c1fa6280 00000000
00008c65 00000000 36d5b0c8 00000000 f70ab3b0 c10c0900 e5d5b060 f70ab3b0
f70ab3d0 c180dbe0 00002530 8a40e181 0002062c e8af8ed0 00000000 f7dacc00
Call Trace:
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c0118397>] recalc_task_prio+0x97/0x1c0
[<c02c090a>] unix_stream_data_wait+0xfa/0x180
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4b1>] autoremove_wake_function+0x11/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c02c1003>] unix_stream_recvmsg+0x673/0x710
[<c0265030>] sock_aio_read+0xb0/0xd0
[<c0160bcd>] do_sync_read+0x6d/0xb0
[<c0160cf7>] vfs_read+0xe7/0x110
[<c017489b>] do_fcntl+0x11b/0x1d0
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
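A dump of this size is easier to read once it is grouped by task state. In the excerpt above, the three sshd tasks in D state (PIDs 3247, 3759 and 3917) all sit in blk_congestion_wait, reached from __alloc_pages during a swap-in page fault. The following is a minimal sketch, not part of the original report, of how such a summary could be produced from saved sysrq-T text. The function names `parse_tasks` and `blocked_on` are hypothetical, and the regular expressions assume the exact header and frame layout shown in this dump (2.6-era i386 output); other kernels print these lines differently.

```python
import re

# Task header as printed in this dump, e.g.
#   "sshd D 00000000 0 3247 3238 3248 (NOTLB)"
# Assumed layout: name, state letter, 8 hex digits, a number, the PID,
# further PIDs, then (NOTLB) or (L-TLB).
TASK_RE = re.compile(
    r"^(\S+)\s+([A-Z])\s+[0-9A-Fa-f]{8}\s+\d+\s+(\d+)\s.*\((?:NOTLB|L-TLB)\)\s*$"
)

# Call-trace frame, e.g. "[<c020e8ab>] blk_congestion_wait+0x7b/0x90"
FRAME_RE = re.compile(r"^\[<[0-9a-f]{8}>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def parse_tasks(dump_text):
    """Split sysrq-T text into a list of tasks with their call-trace symbols."""
    tasks = []
    current = None
    for raw in dump_text.splitlines():
        line = raw.strip()
        header = TASK_RE.match(line)
        if header:
            current = {
                "name": header.group(1),
                "state": header.group(2),
                "pid": int(header.group(3)),
                "frames": [],
            }
            tasks.append(current)
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None:
            current["frames"].append(frame.group(1))
    return tasks


def blocked_on(tasks, symbol):
    """Return the D-state tasks whose call trace mentions the given symbol."""
    return [t for t in tasks if t["state"] == "D" and symbol in t["frames"]]
```

Calling `blocked_on(parse_tasks(text), "blk_congestion_wait")` on a saved dump lists the uninterruptible tasks waiting in that function. Raw stack words and "Call Trace:" lines match neither pattern and are skipped.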
--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc
* Re: 2.6.X kernel memory leak?
2004-04-09 7:17 ` 2.6.X kernel memory leak? Sergey S. Kostyliov
@ 2004-04-09 9:09 ` Andrew Morton
2004-04-09 12:15 ` Sergey S. Kostyliov
0 siblings, 1 reply; 24+ messages in thread
From: Andrew Morton @ 2004-04-09 9:09 UTC (permalink / raw)
To: Sergey S. Kostyliov; +Cc: linux-kernel, anton
"Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
>
> And here is part of sysrq-T for the third machine, which has just locked up,
> kernel is 2.6.5-rc3-aa2.
It does look like a kernel memory leak, but it's not into slab.
You've disabled iptables. Possibly there's a leak in a device driver?
Which drivers are in regular use there? What are you using for those
hardware RAID controllers?
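A rough way to check a statement like "a kernel memory leak, but not into slab" is to subtract the memory that /proc/meminfo does account for from MemTotal and see how much is left over. The sketch below is not from the thread: the function name `unaccounted_kb` is hypothetical, and the choice of fields (MemFree, Active, Inactive, Slab, PageTables) is an assumed heuristic that ignores smaller consumers, so the result is only an estimate.

```python
# Fields treated as "accounted for" in this heuristic (an assumption,
# not a complete list of kernel memory consumers).
ACCOUNTED_FIELDS = ("MemFree", "Active", "Inactive", "Slab", "PageTables")


def unaccounted_kb(meminfo_text):
    """Return MemTotal minus the accounted fields, in kB.

    meminfo_text is the content of /proc/meminfo, one "Name: value kB"
    entry per line. Fields missing from the text count as zero.
    """
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    accounted = sum(fields.get(name, 0) for name in ACCOUNTED_FIELDS)
    return fields["MemTotal"] - accounted
```

If this figure keeps growing between samples while Slab stays flat, that is consistent with pages being allocated directly (for example by a driver) rather than through slab caches.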
* Re: 2.6.X kernel memory leak?
2004-04-09 9:09 ` Andrew Morton
@ 2004-04-09 12:15 ` Sergey S. Kostyliov
0 siblings, 0 replies; 24+ messages in thread
From: Sergey S. Kostyliov @ 2004-04-09 12:15 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, anton
Hello Andrew,
On Friday 09 April 2004 13:09, Andrew Morton wrote:
> "Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
> >
> > And here is part of sysrq-T for the third machine, which has just locked up,
> > kernel is 2.6.5-rc3-aa2.
>
> It does look like a kernel memory leak, but it's not into slab.
>
> You've disabled iptables. Possibly there's a leak in a device driver?
> Which drivers are in regular use there? What are you using for those
> hardware RAID controllers?
I've seen this kind of lockup (according to sysrq-T) on different boxes:
1) ope
RAID: mylex 352
drivers: e100, dac960
.config: http://sysadminday.org.ru/2.6.1-io_lockup/ope/.config
2) terror
RAID: megaraid 320-2
drivers: e1000, megaraid2
.config: http://sysadminday.org.ru/2.6.X-lockup/terror/.config
3) mirror
drivers: e100, aic7xxx, md, netconsole
.config: http://sysadminday.org.ru/2.6.X-lockup/mirror/.config
I also saw the same symptoms on a fourth box, but I'm not sure about
this one because it was not attached to a serial console at that time.
For this box:
RAID: Compaq smart 2
drivers: tlan,epic100,cpqarray
--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc
Thread overview: 24+ messages
2004-01-31 16:40 2.6.1 IO lockup on SMP systems Sergey S. Kostyliov
2004-02-01 0:17 ` Andrew Morton
2004-02-21 16:45 ` Sergey S. Kostyliov
2004-02-21 19:30 ` Andrew Morton
2004-02-22 17:39 ` Alexander Y. Fomichev
2004-02-23 17:27 ` Sergey S. Kostyliov
2004-02-23 21:30 ` Mike Fedyk
2004-02-24 11:56 ` Sergey S. Kostyliov
2004-02-23 22:26 ` Andrew Morton
2004-02-24 7:23 ` Marcelo Tosatti
2004-02-24 6:53 ` Andrew Morton
2004-02-24 11:54 ` Sergey S. Kostyliov
2004-02-26 12:19 ` Sergey S. Kostyliov
2004-02-26 12:53 ` Andrew Morton
2004-02-26 13:11 ` Andrew Morton
2004-02-26 14:37 ` Dave Jones
2004-02-26 15:37 ` Arjan van de Ven
2004-02-26 14:30 ` Sergey S. Kostyliov
2004-02-26 20:03 ` Andrew Morton
2004-02-28 14:56 ` Sergey S. Kostyliov
2004-04-08 9:08 ` 2.6.X kernel memory leak? (was: Re: 2.6.1 IO lockup on SMP systems) Sergey S. Kostyliov
2004-04-09 7:17 ` 2.6.X kernel memory leak? Sergey S. Kostyliov
2004-04-09 9:09 ` Andrew Morton
2004-04-09 12:15 ` Sergey S. Kostyliov