* 2.6.1 IO lockup on SMP systems
@ 2004-01-31 16:40 Sergey S. Kostyliov
2004-02-01 0:17 ` Andrew Morton
0 siblings, 1 reply; 24+ messages in thread
From: Sergey S. Kostyliov @ 2004-01-31 16:40 UTC (permalink / raw)
To: linux-kernel; +Cc: anton
Hello all,
I have experienced lockups on three of my servers with 2.6.1. It doesn't
look like a deadlock: the box is still pingable and all TCP ports which were
in listen state before a lockup remain in listen state, but I can't get
any data from these ports. According to sar(1), the systems were not overloaded
right before a lockup, and there are no log entries in any user service logs
for almost 10 hours after the lockup.
So I think this is an IO lockup. On the other hand, it doesn't look like a bug
in a particular controller driver, because the controllers are different on each box.
And finally, it doesn't look like a bug in a particular IO scheduler, because two
of the boxes were running "deadline" and one "as". Of course, these
assumptions are valid only if all the lockups I have seen have the same cause.
All three boxes are SMP. Unfortunately, all are remote and aren't attached
to a serial console yet (this is planned for the next couple of weeks).
1) ope
01:02.1 RAID bus controller: Mylex Corporation: Unknown device 0050 (rev 02)
elevator=deadline
.config: http://sysadminday.org.ru/2.6.1-io_lockup/ope/.config
lspci: http://sysadminday.org.ru/2.6.1-io_lockup/ope/lspci
lspci -vvn: http://sysadminday.org.ru/2.6.1-io_lockup/ope/lspci_-vvn
2) white
02:04.0 RAID bus controller: American Megatrends Inc. MegaRAID (rev 02)
elevator=deadline
.config: http://sysadminday.org.ru/2.6.1-io_lockup/white/.config
lspci: http://sysadminday.org.ru/2.6.1-io_lockup/white/lspci
lspci -vvn: http://sysadminday.org.ru/2.6.1-io_lockup/white/lspci_-vvn
3) tiny
02:00.0 Unknown mass storage controller: Compaq Computer Corporation Smart-2/P RAID Controller (rev 03)
03:00.0 Unknown mass storage controller: Compaq Computer Corporation Smart-2/P RAID Controller (rev 03)
elevator=as
.config: http://sysadminday.org.ru/2.6.1-io_lockup/tiny/.config
lspci: http://sysadminday.org.ru/2.6.1-io_lockup/tiny/lspci
lspci -vvn: http://sysadminday.org.ru/2.6.1-io_lockup/tiny/lspci_-vvn
Any hints will be appreciated.
--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 2.6.1 IO lockup on SMP systems
2004-01-31 16:40 2.6.1 IO lockup on SMP systems Sergey S. Kostyliov
@ 2004-02-01 0:17 ` Andrew Morton
2004-02-21 16:45 ` Sergey S. Kostyliov
0 siblings, 1 reply; 24+ messages in thread
From: Andrew Morton @ 2004-02-01 0:17 UTC (permalink / raw)
To: Sergey S. Kostyliov; +Cc: linux-kernel, anton
"Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
>
> I have experienced lockups on three of my servers with 2.6.1. It doesn't
> look like a deadlock: the box is still pingable and all TCP ports which were
> in listen state before a lockup remain in listen state, but I can't get
> any data from these ports. According to sar(1), the systems were not overloaded
> right before a lockup, and there are no log entries in any user service logs
> for almost 10 hours after the lockup.
Please ensure that CONFIG_KALLSYMS is enabled, then generate an all-tasks
backtrace of a locked machine with sysrq-T or `echo t >
/proc/sysrq-trigger'. Then send us the resulting trace.
You may need a serial console to be able to capture all the output.
Also, it would be useful to know what sort of load the machines are under,
and what filesystems are in use.
Thanks.
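For readers who want to script the request above, here is a minimal Python sketch. It is not something posted in the thread. The function names `kallsyms_enabled` and `trigger_sysrq` are hypothetical, the .config text is assumed to be supplied by the caller, and writing to /proc/sysrq-trigger is assumed to require root and an enabled sysrq interface.

```python
import re

SYSRQ_TRIGGER = "/proc/sysrq-trigger"  # path taken from the message above


def kallsyms_enabled(config_text):
    """Return True if the given kernel .config text sets CONFIG_KALLSYMS=y."""
    return re.search(r"^CONFIG_KALLSYMS=y$", config_text, re.MULTILINE) is not None


def trigger_sysrq(key, path=SYSRQ_TRIGGER):
    """Write one sysrq command character, like `echo t > /proc/sysrq-trigger'."""
    if len(key) != 1:
        raise ValueError("sysrq key must be a single character")
    with open(path, "w") as f:
        f.write(key)
```

Calling `trigger_sysrq("t")` as root asks the kernel for the all-tasks backtrace. The output goes to the kernel log and console, so it still has to be captured there, for example over a serial console as suggested above.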
* Re: 2.6.1 IO lockup on SMP systems
2004-02-01 0:17 ` Andrew Morton
@ 2004-02-21 16:45 ` Sergey S. Kostyliov
2004-02-21 19:30 ` Andrew Morton
0 siblings, 1 reply; 24+ messages in thread
From: Sergey S. Kostyliov @ 2004-02-21 16:45 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, anton
Hello Andrew,
On Sunday 01 February 2004 03:17, Andrew Morton wrote:
> "Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
> >
> > I have experienced lockups on three of my servers with 2.6.1. It doesn't
> > look like a deadlock: the box is still pingable and all TCP ports which were
> > in listen state before a lockup remain in listen state, but I can't get
> > any data from these ports. According to sar(1), the systems were not overloaded
> > right before a lockup, and there are no log entries in any user service logs
> > for almost 10 hours after the lockup.
>
> Please ensure that CONFIG_KALLSYMS is enabled, then generate an all-tasks
> backtrace of a locked machine with sysrq-T or `echo t >
> /proc/sysrq-trigger'. Then send us the resulting trace.
I've just reproduced this lockup with 2.6.3.
>
> You may need a serial console to be able to capture all the output.
>
> Also, it would be useful to know what sort of load the machines are under,
> and what filesystems are in use.
The machine is an HTTP server. The main applications are:
1) apache 1.3, which serves PHP pages (mod_php):
15.3 requests/sec - 111.9 kB/second - 7.3 kB/request
54 requests currently being processed, 19 idle servers
2) mysql:
Threads: 19 Questions: 26922012 Slow queries: 9799 Opens: 64980
Flush tables: 1 Open tables: 630 Queries per second avg: 143.547
This is an IO-bound machine in general. All filesystems are reiserfs.
Here is the sysrq-T output obtained from a locked box via serial console:
SysRq : Show State
free sibling
task PC stack pid father child younger older
init D 28E916FC 24 1 0 2 (NOTLB)
c244fcf0 00000086 d8460080 28e916fc 00003243 c2422bc0 f77fbd00 00000096
d8460080 2ede4081 00003243 c02af980 00000001 2ede4181 00003243 d8460080
d84600a0 c2422bc0 000017a2 2ede43e1 00003243 c244dac8 03471525 c244fd04
Call Trace:
[<c012b62c>] schedule_timeout+0x6c/0xc0
[<c0142b11>] wakeup_bdflush+0x21/0x40
[<c012b5b0>] process_timeout+0x0/0x10
[<c011eb7b>] io_schedule_timeout+0x2b/0x40
[<c01f54a4>] blk_congestion_wait+0x84/0xa0
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c011fda0>] autoremove_wake_function+0x0/0x50
[<c0148f9f>] try_to_free_pages+0xef/0x190
[<c014187c>] __alloc_pages+0x1dc/0x350
[<c013e6a9>] filemap_nopage+0x329/0x3d0
[<c0157728>] read_swap_cache_async+0xb8/0xd0
[<c014c903>] swapin_readahead+0x43/0x90
[<c014cb98>] do_swap_page+0x248/0x320
[<c014d4d0>] handle_mm_fault+0xe0/0x1b0
[<c011a94c>] do_page_fault+0x33c/0x530
[<c0171334>] sys_select+0x264/0x520
[<c011a610>] do_page_fault+0x0/0x530
[<c0109d25>] error_code+0x2d/0x38
migration/0 S 00000001 12 2 1 3 (L-TLB)
c245dfc4 00000046 c241abc0 00000001 00000003 c245df98 c02ab560 f7da6d00
69378bf7 247a42e0 00000000 e11a4de2 f77e1f58 c245c000 f77e1f50 00000292
c245dfc4 c241abc0 00001801 e11a54a6 00000001 c244ce68 c241b4ec c245c000
Call Trace:
[<c011f4af>] migration_thread+0xdf/0x160
[<c011f3d0>] migration_thread+0x0/0x160
[<c0106f79>] kernel_thread_helper+0x5/0xc
ksoftirqd/0 S 00000001 24 3 1 4 2 (L-TLB)
c245bfd8 00000046 c241abc0 00000001 00000003 f3fc8ca8 f63f62e0 c241c54c
c245bf94 c245bf94 c241c55c 00000000 cebbd940 253f991d 000019b1 f5f766d0
f5f766f0 c241abc0 0000010b 8d075105 000019ea c244c838 c245a000 c245a000
Call Trace:
[<c0126d22>] ksoftirqd+0xe2/0x100
[<c0126c40>] ksoftirqd+0x0/0x100
[<c0106f79>] kernel_thread_helper+0x5/0xc
migration/1 S 00000001 8 4 1 5 3 (L-TLB)
c2459fc4 00000046 c2422bc0 00000001 00000003
c02aeedc c02ab560 c0336c60 c012bb70 c02aeedc c02aeed8 c2458000 c0123735 00000082 c02aba30 00000008 c2458000 c2422bc0 00004b63 0295e0d2 00000000 c244c208 c24234ec c2458000 Call Trace: [<c012bb70>] free_uid+0x20/0x90 [<c0123735>] reparent_to_init+0x105/0x1a0 [<c011f4af>] migration_thread+0xdf/0x160 [<c011f3d0>] migration_thread+0x0/0x160 [<c0106f79>] kernel_thread_helper+0x5/0xc ksoftirqd/1 S C0134355 24 5 1 6 4 (L-TLB) c2455fd8 00000046 c03385e0 c0134355 02002bfd cad564c8 ed2ed0e0 c242454c c2455f94 c2455f94 c242455c 00000000 c2454000 c033759c c0126a03 eca2f350 eca2f370 c2422bc0 0000025a 31392d56 000019e4 c2457ae8 c2454000 c2454000 Call Trace: [<c0134355>] rcu_process_callbacks+0x155/0x190 [<c0126a03>] tasklet_action+0x73/0xe0 [<c0126d22>] ksoftirqd+0xe2/0x100 [<c0126c40>] ksoftirqd+0x0/0x100 [<c0106f79>] kernel_thread_helper+0x5/0xc events/0 S 00000001 0 6 1 14588 7 5 (L-TLB) f7f93f70 00000046 c241abc0 00000001 00000003 0000000b f77fb8c0 c02b8124 00000246 c241b520 c0353e40 00000000 f7fcbbe4 f7f92000 f7fcbbe0 00000092 f7f93f70 c241abc0 000001c9 25a4fd1b 00003243 c24574b8 f7f92000 f7fcbbcc Call Trace: [<c01333e5>] worker_thread+0x285/0x2b0 [<c01e5a60>] console_callback+0x0/0xe0 [<c011d8e0>] default_wake_function+0x0/0x20 [<c0109192>] ret_from_fork+0x6/0x14 [<c011d8e0>] default_wake_function+0x0/0x20 [<c0133160>] worker_thread+0x0/0x2b0 [<c0106f79>] kernel_thread_helper+0x5/0xc events/1 S ED2ED520 0 7 1 8 6 (L-TLB) f7f91f70 00000046 00000000 ed2ed520 00000000 e07e9e88 ed2ed520 00000000 00000000 f630a080 0000007b 0000007b f630a080 f630a0a0 c2422bc0 f6258d20 f6258d40 c2422bc0 0000006b ecdabbfb 000019ef c2456e88 f7f90000 f7fcbc2c Call Trace: [<c01333e5>] worker_thread+0x285/0x2b0 [<c0132e00>] __call_usermodehelper+0x0/0x70 [<c011d8e0>] default_wake_function+0x0/0x20 [<c0109192>] ret_from_fork+0x6/0x14 [<c011d8e0>] default_wake_function+0x0/0x20 [<c0133160>] worker_thread+0x0/0x2b0 [<c0106f79>] kernel_thread_helper+0x5/0xc kblockd/0 S 00000001 24 8 1 9 7 (L-TLB) c2527f70 
00000046 c241abc0 00000001 00000003 00000001 f776f2a0 f7fa8000 c02027ec c2772e00 f3dbde28 f7c37834 f7fcb3a4 c2526000 f7fcb3a0 00000092 c2527f70 c241abc0 0000067d 03fc798c 00002b4f c2456858 c2526000 f7fcb38c Call Trace: [<c02027ec>] DAC960_process_queue+0x1c/0x170 [<c01333e5>] worker_thread+0x285/0x2b0 [<c01f4670>] blk_unplug_work+0x0/0x20 [<c011d8e0>] default_wake_function+0x0/0x20 [<c0109192>] ret_from_fork+0x6/0x14 [<c011d8e0>] default_wake_function+0x0/0x20 [<c0133160>] worker_thread+0x0/0x2b0 [<c0106f79>] kernel_thread_helper+0x5/0xc kblockd/1 S C2772E00 8 9 1 13 8 (L-TLB) c2525f70 00000046 c01f29d6 c2772e00 00000003 00000003 d1a49ae0 f7fa8000 c02027ec c2772e00 ecffc5d8 c2763c60 f7fcb404 c2524000 f7fcb400 c25026b0 c25026d0 c2422bc0 00000961 ff665670 00002b4e c2456228 c2524000 f7fcb3ec Call Trace: [<c01f29d6>] elv_next_request+0x16/0x110 [<c02027ec>] DAC960_process_queue+0x1c/0x170 [<c01333e5>] worker_thread+0x285/0x2b0 [<c01f4670>] blk_unplug_work+0x0/0x20 [<c011d8e0>] default_wake_function+0x0/0x20 [<c0109192>] ret_from_fork+0x6/0x14 [<c011d8e0>] default_wake_function+0x0/0x20 [<c0133160>] worker_thread+0x0/0x2b0 [<c0106f79>] kernel_thread_helper+0x5/0xc kswapd0 S C23FFE98 0 13 1 10 9 (L-TLB) f7dfbf04 00000046 c23fff38 c23ffe98 000000d0 00000200 f77a06c0 c02b0280 00000002 00000000 c0149200 00000100 c02b0280 000000d0 00000200 f72ecce0 f72ecd00 c2422bc0 0000b6a2 df9ac558 00003243 c2502878 f7dfa000 f7dfbf20 Call Trace: [<c0149200>] balance_pgdat+0x1c0/0x250 [<c014939b>] kswapd+0x10b/0x160 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c0109192>] ret_from_fork+0x6/0x14 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c0149290>] kswapd+0x0/0x160 [<c0106f79>] kernel_thread_helper+0x5/0xc kirqd S 00000001 8 10 1 14 13 (L-TLB) c2501fa0 00000046 c2422bc0 00000001 00000003 00000000 d1a49040 00000000 c0109c28 00000000 000000d5 005d2025 c244d2d0 4926873b 03471a9a f77ac6f0 f77ac710 c2422bc0 000006fd 881c4eb1 00003243 c2503b08 03472e23 c2501fb4 Call Trace: [<c0109c28>] 
common_interrupt+0x18/0x20 [<c012b62c>] schedule_timeout+0x6c/0xc0 [<c012b5b0>] process_timeout+0x0/0x10 [<c0118ee7>] balanced_irq+0x57/0x80 [<c0118e90>] balanced_irq+0x0/0x80 [<c0106f79>] kernel_thread_helper+0x5/0xc aio/0 S 00000082 0 14 1 15 10 (L-TLB) f7da9f70 00000046 00000001 00000082 00000001 c244ff68 c02ab560 f7da9f4c c011d93a c244d900 00000003 00000000 c244ff68 f7da8000 00010000 c244d900 c244d920 c241abc0 000027fb 1965fa0c 00000000 c2502248 f7da8000 00000000 Call Trace: [<c011d93a>] __wake_up_common+0x3a/0x70 [<c01333e5>] worker_thread+0x285/0x2b0 [<c011d8e0>] default_wake_function+0x0/0x20 [<c0109192>] ret_from_fork+0x6/0x14 [<c011d8e0>] default_wake_function+0x0/0x20 [<c0133160>] worker_thread+0x0/0x2b0 [<c0106f79>] kernel_thread_helper+0x5/0xc aio/1 S 00000001 0 15 1 16 14 (L-TLB) f7da5f70 00000046 c2422bc0 00000001 00000003 c244ff68 c02ab560 f7da5f4c c011d93a c244d900 00000003 00000000 c244ff68 f7da4000 00010000 f7da7960 f7dd7c04 c2422bc0 0000241f 19668d09 00000000 f7da7b28 f7da4000 00000000 Call Trace: [<c011d93a>] __wake_up_common+0x3a/0x70 [<c01333e5>] worker_thread+0x285/0x2b0 [<c011d8e0>] default_wake_function+0x0/0x20 [<c0109192>] ret_from_fork+0x6/0x14 [<c011d8e0>] default_wake_function+0x0/0x20 [<c0133160>] worker_thread+0x0/0x2b0 [<c0106f79>] kernel_thread_helper+0x5/0xc kseriod S 00000001 1688 16 1 17 15 (L-TLB) c26c9fb0 00000046 c2422bc0 00000001 00000003 e2024870 f68ad760 00000002 e2024000 012aeedc e20248b0 c02be660 c02be8a0 c02be820 00000286 c021ab1a c02be8a0 c2422bc0 018bda82 3f31580d 0000317e f7da6268 c26c8000 ffffe000 Call Trace: [<c021ab1a>] serio_find_dev+0x6a/0x70 [<c021adb6>] serio_thread+0x146/0x180 [<c0109192>] ret_from_fork+0x6/0x14 [<c011d8e0>] default_wake_function+0x0/0x20 [<c021ac70>] serio_thread+0x0/0x180 [<c0106f79>] kernel_thread_helper+0x5/0xc reiserfs/0 S 00000003 0 17 1 18 16 (L-TLB) c2697f70 00000046 f880b38c 00000003 00000001 00000000 ecec66e0 f880b398 f880b34c 00000292 c01b824f f8831c20 c26dce44 c2696000 c26dce40 
cdc9aca0 cdc9acc0 c241abc0 00001bcf c3d3faeb 00001a46 f7da6898 c2696000 c26dce2c Call Trace: [<c01b824f>] kupdate_one_transaction+0x12f/0x250 [<c01333e5>] worker_thread+0x285/0x2b0 [<c01b97c0>] reiserfs_journal_commit_task_func+0x0/0x100 [<c011d8e0>] default_wake_function+0x0/0x20 [<c0109192>] ret_from_fork+0x6/0x14 [<c011d8e0>] default_wake_function+0x0/0x20 [<c0133160>] worker_thread+0x0/0x2b0 [<c0106f79>] kernel_thread_helper+0x5/0xc reiserfs/1 S 00000003 0 18 1 23 17 (L-TLB) f77e1f70 00000046 f8a272ec 00000003 00000001 00000000 f776fb20 00000000 f8a272ac f77a11f0 c01b824f f77a11f0 c26dcea4 f77e0000 c26dcea0 f38aace0 f38aad00 c2422bc0 00000448 e234458a 00001a55 f7da6ec8 f77e0000 c26dce8c Call Trace: [<c01b824f>] kupdate_one_transaction+0x12f/0x250 [<c01333e5>] worker_thread+0x285/0x2b0 [<c01b97c0>] reiserfs_journal_commit_task_func+0x0/0x100 [<c011d8e0>] default_wake_function+0x0/0x20 [<c0109192>] ret_from_fork+0x6/0x14 [<c011d8e0>] default_wake_function+0x0/0x20 [<c0133160>] worker_thread+0x0/0x2b0 [<c0106f79>] kernel_thread_helper+0x5/0xc devfsd D 25935C12 16 23 1 610 18 (NOTLB) f7683bcc 00000086 f5f6b980 25935c12 00003243 c241abc0 f77fb8c0 00000096 f5f6b980 25935ac8 00003243 c02af980 00000001 25935c12 00003243 f5f6b980 f5f6b9a0 c241abc0 00002372 25935f22 00003243 f7757538 03471489 f7683be0 Call Trace: [<c012b62c>] schedule_timeout+0x6c/0xc0 [<c0142b11>] wakeup_bdflush+0x21/0x40 [<c012b5b0>] process_timeout+0x0/0x10 [<c011eb7b>] io_schedule_timeout+0x2b/0x40 [<c01f54a4>] blk_congestion_wait+0x84/0xa0 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c0148f9f>] try_to_free_pages+0xef/0x190 [<c014187c>] __alloc_pages+0x1dc/0x350 [<c0157728>] read_swap_cache_async+0xb8/0xd0 [<c014c903>] swapin_readahead+0x43/0x90 [<c014cb98>] do_swap_page+0x248/0x320 [<c014d4d0>] handle_mm_fault+0xe0/0x1b0 [<c011a94c>] do_page_fault+0x33c/0x530 [<c0141740>] __alloc_pages+0xa0/0x350 [<c011b858>] recalc_task_prio+0xa8/0x1d0 [<c011d503>] 
schedule+0x373/0x700 [<c011a610>] do_page_fault+0x0/0x530 [<c0109d25>] error_code+0x2d/0x38 [<c01cc1f6>] __copy_to_user_ll+0x46/0x80 [<c01c161e>] devfsd_read+0x42e/0x4e0 [<c011d8e0>] default_wake_function+0x0/0x20 [<c011d8e0>] default_wake_function+0x0/0x20 [<c015c4a8>] vfs_read+0xb8/0x130 [<c015c752>] sys_read+0x42/0x70 [<c01092bb>] syscall_call+0x7/0xb syslogd D 00000001 0 610 1 616 23 (NOTLB) f71cdcf0 00000086 c241abc0 00000001 00000003 c2422bc0 f77a04a0 00000096 f5f6b980 24068d22 00003243 c02af980 00000001 00000096 f71cc000 f71cc000 f71cdd04 c241abc0 00001cf0 2406959c 00003243 f7da74f8 0347146f f71cdd04 Call Trace: [<c012b62c>] schedule_timeout+0x6c/0xc0 [<c0142b11>] wakeup_bdflush+0x21/0x40 [<c012b5b0>] process_timeout+0x0/0x10 [<c011eb7b>] io_schedule_timeout+0x2b/0x40 [<c01f54a4>] blk_congestion_wait+0x84/0xa0 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c0148f9f>] try_to_free_pages+0xef/0x190 [<c014187c>] __alloc_pages+0x1dc/0x350 [<c013e6a9>] filemap_nopage+0x329/0x3d0 [<c0157728>] read_swap_cache_async+0xb8/0xd0 [<c014c903>] swapin_readahead+0x43/0x90 [<c014cb98>] do_swap_page+0x248/0x320 [<c014d4d0>] handle_mm_fault+0xe0/0x1b0 [<c011a94c>] do_page_fault+0x33c/0x530 [<c0125b1e>] do_setitimer+0x1be/0x1f0 [<c01086c0>] sys_sigreturn+0xf0/0x110 [<c011a610>] do_page_fault+0x0/0x530 [<c0109d25>] error_code+0x2d/0x38 klogd D B043E3E1 0 616 1 699 610 (NOTLB) f7769cf0 00000086 d8460080 b6390e66 00003244 c2422bc0 f7306d60 00000096 d8460080 b6390d6f 00003244 c02af980 00000001 b6390e66 00003244 d8460080 d84600a0 c2422bc0 00001736 bc2e3c36 00003244 f7628838 03472f32 f7769d04 Call Trace: [<c012b62c>] schedule_timeout+0x6c/0xc0 [<c0142b11>] wakeup_bdflush+0x21/0x40 [<c012b5b0>] process_timeout+0x0/0x10 [<c011eb7b>] io_schedule_timeout+0x2b/0x40 [<c01f54a4>] blk_congestion_wait+0x84/0xa0 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c0148f9f>] try_to_free_pages+0xef/0x190 
[<c014187c>] __alloc_pages+0x1dc/0x350 [<c0157728>] read_swap_cache_async+0xb8/0xd0 [<c014c903>] swapin_readahead+0x43/0x90 [<c014cb98>] do_swap_page+0x248/0x320 [<c014d4d0>] handle_mm_fault+0xe0/0x1b0 [<c011a94c>] do_page_fault+0x33c/0x530 [<c011d8e0>] default_wake_function+0x0/0x20 [<c012b4b0>] do_timer+0xc0/0xd0 [<c015c4c2>] vfs_read+0xd2/0x130 [<c015c752>] sys_read+0x42/0x70 [<c011a610>] do_page_fault+0x0/0x530 [<c0109d25>] error_code+0x2d/0x38 ntpd S 00000001 16 699 1 718 616 (NOTLB) f735deb0 00000086 c2422bc0 00000001 00000003 00000000 f77a06c0 c02afe00 00000000 00000000 08098478 00000000 f72ecce0 00000010 c02b0700 00000000 000000d0 c2422bc0 00000260 ce28bcca 00003244 f72ecea8 00000000 7fffffff Call Trace: [<c012b67e>] schedule_timeout+0xbe/0xc0 [<c022485b>] datagram_poll+0x2b/0xca [<c021e809>] sock_poll+0x29/0x40 [<c0170f21>] do_select+0x1a1/0x310 [<c0170bb0>] __pollwait+0x0/0xd0 [<c01713cb>] sys_select+0x2fb/0x520 [<c01092bb>] syscall_call+0x7/0xb sshd D EBC82985 0 718 1 1051 741 699 (NOTLB) f77bfc84 00000082 d8460080 ebc82985 00003244 c2422bc0 f77fb040 00000082 d8460080 ebc82884 00003244 c02af980 00000001 f1bd4c48 00003244 d8460080 d84600a0 c2422bc0 00001749 f1bd4e95 00003244 f71f3b28 034732b5 f77bfc98 Call Trace: [<c012b62c>] schedule_timeout+0x6c/0xc0 [<c0142b11>] wakeup_bdflush+0x21/0x40 [<c012b5b0>] process_timeout+0x0/0x10 [<c011eb7b>] io_schedule_timeout+0x2b/0x40 [<c01f54a4>] blk_congestion_wait+0x84/0xa0 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c0148f9f>] try_to_free_pages+0xef/0x190 [<c014187c>] __alloc_pages+0x1dc/0x350 [<c0143bc2>] do_page_cache_readahead+0x172/0x1e0 [<c013e511>] filemap_nopage+0x191/0x3d0 [<c013e380>] filemap_nopage+0x0/0x3d0 [<c014cfd3>] do_no_page+0xd3/0x3c0 [<c014acc7>] pte_alloc_map+0xc7/0x110 [<c014d4f6>] handle_mm_fault+0x106/0x1b0 [<c011a94c>] do_page_fault+0x33c/0x530 [<c0171334>] sys_select+0x264/0x520 [<c011a610>] do_page_fault+0x0/0x530 [<c0109d25>] 
error_code+0x2d/0x38 xinetd D 00000001 0 741 1 758 718 (NOTLB) f7627c84 00000086 c241abc0 00000001 00000003 f7626000 f77fb6a0 212e541d 00003243 f5f6b980 f5f6b9a0 c241abc0 00015dbd 212e559d 00003243 f7626000 f7627c98 c241abc0 000004c7 212e61bf 00003243 f72ff4b8 03471440 f7627c98 Call Trace: [<c012b62c>] schedule_timeout+0x6c/0xc0 [<c0142b11>] wakeup_bdflush+0x21/0x40 [<c012b5b0>] process_timeout+0x0/0x10 [<c011eb7b>] io_schedule_timeout+0x2b/0x40 [<c01f54a4>] blk_congestion_wait+0x84/0xa0 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c0148f9f>] try_to_free_pages+0xef/0x190 [<c014187c>] __alloc_pages+0x1dc/0x350 [<c0143bc2>] do_page_cache_readahead+0x172/0x1e0 [<c01f5d56>] generic_make_request+0x106/0x190 [<c013e511>] filemap_nopage+0x191/0x3d0 [<c013e380>] filemap_nopage+0x0/0x3d0 [<c014cfd3>] do_no_page+0xd3/0x3c0 [<c014acc7>] pte_alloc_map+0xc7/0x110 [<c014d4f6>] handle_mm_fault+0x106/0x1b0 [<c011a94c>] do_page_fault+0x33c/0x530 [<c01cc1fa>] __copy_to_user_ll+0x4a/0x80 [<c0171334>] sys_select+0x264/0x520 [<c011d8e0>] default_wake_function+0x0/0x20 [<c011a610>] do_page_fault+0x0/0x530 [<c0109d25>] error_code+0x2d/0x38 svscan D 00000001 0 758 1 759 785 741 (NOTLB) f736fcf0 00000082 c241abc0 00000001 00000003 c2422bc0 f776f080 00000096 d8460080 24345b50 00003243 c02af980 00000001 00000096 f736e000 f736e000 f736fd04 c241abc0 00001e2b 24346317 00003243 f72ec878 03471472 f736fd04 Call Trace: [<c012b62c>] schedule_timeout+0x6c/0xc0 [<c0142b11>] wakeup_bdflush+0x21/0x40 [<c012b5b0>] process_timeout+0x0/0x10 [<c011eb7b>] io_schedule_timeout+0x2b/0x40 [<c01f54a4>] blk_congestion_wait+0x84/0xa0 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c0148f9f>] try_to_free_pages+0xef/0x190 [<c014187c>] __alloc_pages+0x1dc/0x350 [<c013e6a9>] filemap_nopage+0x329/0x3d0 [<c0157728>] read_swap_cache_async+0xb8/0xd0 [<c014c903>] swapin_readahead+0x43/0x90 [<c014cb98>] do_swap_page+0x248/0x320 
[<c014d4d0>] handle_mm_fault+0xe0/0x1b0 [<c011a94c>] do_page_fault+0x33c/0x530 [<c012b636>] schedule_timeout+0x76/0xc0 [<c013027e>] sys_rt_sigaction+0xfe/0x120 [<c012b5b0>] process_timeout+0x0/0x10 [<c012b85e>] sys_nanosleep+0x10e/0x1c0 [<c011a610>] do_page_fault+0x0/0x530 [<c0109d25>] error_code+0x2d/0x38 supervise D 00000001 0 759 758 761 760 (NOTLB) f6e4fcf0 00000086 c2422bc0 00000001 00000003 f6e4e000 f77a0060 606e7502 00003245 d8460080 d84600a0 c2422bc0 0000dc5d 606e7682 00003245 f6e4e000 f6e4fd04 c2422bc0 00000496 6672deb3 00003245 f72ffae8 03473a5b f6e4fd04 Call Trace: [<c012b62c>] schedule_timeout+0x6c/0xc0 [<c0142b11>] wakeup_bdflush+0x21/0x40 [<c012b5b0>] process_timeout+0x0/0x10 [<c011eb7b>] io_schedule_timeout+0x2b/0x40 [<c01f54a4>] blk_congestion_wait+0x84/0xa0 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c0148f9f>] try_to_free_pages+0xef/0x190 [<c014187c>] __alloc_pages+0x1dc/0x350 [<c013e6a9>] filemap_nopage+0x329/0x3d0 [<c0157728>] read_swap_cache_async+0xb8/0xd0 [<c014c903>] swapin_readahead+0x43/0x90 [<c014cb98>] do_swap_page+0x248/0x320 [<c014d4d0>] handle_mm_fault+0xe0/0x1b0 [<c011a94c>] do_page_fault+0x33c/0x530 [<c0170ba4>] poll_freewait+0x44/0x50 [<c01719d2>] sys_poll+0x272/0x2c0 [<c0170bb0>] __pollwait+0x0/0xd0 [<c011a610>] do_page_fault+0x0/0x530 [<c0109d25>] error_code+0x2d/0x38 supervise D 00000001 0 760 758 762 759 (NOTLB) f6e4dc84 00000082 c241abc0 00000001 00000003 c2422bc0 f7306b40 00000082 f5f6b980 211bd57b 00003243 c02af980 00000001 00000082 f6e4c000 f5f83310 f5f83330 c241abc0 0001b8e4 211bdfd4 00003243 f72fe228 0347143e f6e4dc98 Call Trace: [<c012b62c>] schedule_timeout+0x6c/0xc0 [<c0142b11>] wakeup_bdflush+0x21/0x40 [<c012b5b0>] process_timeout+0x0/0x10 [<c011eb7b>] io_schedule_timeout+0x2b/0x40 [<c01f54a4>] blk_congestion_wait+0x84/0xa0 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c0148f9f>] try_to_free_pages+0xef/0x190 
[<c014187c>] __alloc_pages+0x1dc/0x350 [<c0143bc2>] do_page_cache_readahead+0x172/0x1e0 [<c013e511>] filemap_nopage+0x191/0x3d0 [<c013e380>] filemap_nopage+0x0/0x3d0 [<c014cfd3>] do_no_page+0xd3/0x3c0 [<c014acc7>] pte_alloc_map+0xc7/0x110 [<c014d4f6>] handle_mm_fault+0x106/0x1b0 [<c011a94c>] do_page_fault+0x33c/0x530 [<c0170ba4>] poll_freewait+0x44/0x50 [<c01719d2>] sys_poll+0x272/0x2c0 [<c0170bb0>] __pollwait+0x0/0xd0 [<c011a610>] do_page_fault+0x0/0x530 [<c0109d25>] error_code+0x2d/0x38 dnscache D A631E98B 0 761 759 (NOTLB) f6e2bc84 00000086 d8460080 ac271597 00003245 c2422bc0 f776f2a0 00000082 d8460080 ac271452 00003245 c02af980 00000001 ac271597 00003245 d8460080 d84600a0 c2422bc0 00001661 ac27189b 00003245 f77ad518 03473f51 f6e2bc98 Call Trace: [<c012b62c>] schedule_timeout+0x6c/0xc0 [<c0142b11>] wakeup_bdflush+0x21/0x40 [<c012b5b0>] process_timeout+0x0/0x10 [<c011eb7b>] io_schedule_timeout+0x2b/0x40 [<c01f54a4>] blk_congestion_wait+0x84/0xa0 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c0148f9f>] try_to_free_pages+0xef/0x190 [<c014187c>] __alloc_pages+0x1dc/0x350 [<c0143bc2>] do_page_cache_readahead+0x172/0x1e0 [<c013e511>] filemap_nopage+0x191/0x3d0 [<c013e380>] filemap_nopage+0x0/0x3d0 [<c014cfd3>] do_no_page+0xd3/0x3c0 [<c014acc7>] pte_alloc_map+0xc7/0x110 [<c014d4f6>] handle_mm_fault+0x106/0x1b0 [<c011a94c>] do_page_fault+0x33c/0x530 [<c0170ba4>] poll_freewait+0x44/0x50 [<c01719d2>] sys_poll+0x272/0x2c0 [<c0170bb0>] __pollwait+0x0/0xd0 [<c011a610>] do_page_fault+0x0/0x530 [<c0109d25>] error_code+0x2d/0x38 multilog S 00000001 4036 762 760 (NOTLB) f6de9eb4 00000086 c2422bc0 00000001 00000003 c013d359 f77a0b00 00000000 f7629900 c1a5c288 c02b0e40 c1a5c288 0001b38a 0001b38a 00000292 c0157bdf c034dac0 c2422bc0 00000d4b d73a9523 00001a38 f7629ac8 f739b66c f739b600 Call Trace: [<c013d359>] __lock_page+0xb9/0xd0 [<c0157bdf>] swap_free+0x2f/0x50 [<c0169f8e>] pipe_wait+0x7e/0xa0 [<c011fda0>] 
autoremove_wake_function+0x0/0x50 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c016a19f>] pipe_readv+0x1ef/0x2f0 [<c016a2d8>] pipe_read+0x38/0x40 [<c015c4a8>] vfs_read+0xb8/0x130 [<c012f0af>] sys_rt_sigprocmask+0xbf/0x190 [<c015c752>] sys_read+0x42/0x70 [<c01092bb>] syscall_call+0x7/0xb httpd D 24562C9D 0 785 1 2898 828 758 (NOTLB) f72a9c84 00000082 00000000 24562c9d 00003243 c2422bc0 f776fb20 00000082 f5f6b980 24562c9d 00003243 c02af980 00000001 00000082 f72a8000 f7756d40 f7756d60 c241abc0 000072d3 24563c20 00003243 f72fe858 03471475 f72a9c98 Call Trace: [<c012b62c>] schedule_timeout+0x6c/0xc0 [<c0142b11>] wakeup_bdflush+0x21/0x40 [<c012b5b0>] process_timeout+0x0/0x10 [<c011eb7b>] io_schedule_timeout+0x2b/0x40 [<c01f54a4>] blk_congestion_wait+0x84/0xa0 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c0148f9f>] try_to_free_pages+0xef/0x190 [<c014187c>] __alloc_pages+0x1dc/0x350 [<c0143bc2>] do_page_cache_readahead+0x172/0x1e0 [<c013d1b5>] unlock_page+0x15/0x60 [<c013e511>] filemap_nopage+0x191/0x3d0 [<c013e380>] filemap_nopage+0x0/0x3d0 [<c014cfd3>] do_no_page+0xd3/0x3c0 [<c0157bdf>] swap_free+0x2f/0x50 [<c014acc7>] pte_alloc_map+0xc7/0x110 [<c014d4f6>] handle_mm_fault+0x106/0x1b0 [<c011a94c>] do_page_fault+0x33c/0x530 [<c0170bb0>] __pollwait+0x0/0xd0 [<c0171334>] sys_select+0x264/0x520 c0109d25>] error_code+0x2d/0x38 mysqld_safe S 00000000 0 835 1 924 851 828 (NOTLB) f693df50 00000086 7dda6067 00000000 f696bb60 f696bb80 f696bb60 f77acd20 c011a94c f696bb60 f69a9620 080c9f9c 00000001 00000001 080c9f9c f71f26d0 f71f26f0 c241abc0 0001663f 2f2e593c 00000008 f77acee8 fffffe00 f693c000 Call Trace: [<c011a94c>] do_page_fault+0x33c/0x530 [<c01254ab>] sys_wait4+0x1bb/0x290 [<c011d8e0>] default_wake_function+0x0/0x20 [<c012f0f3>] sys_rt_sigprocmask+0x103/0x190 [<c011d8e0>] default_wake_function+0x0/0x20 [<c01092bb>] syscall_call+0x7/0xb qmail-send D 3F7BF822 0 851 1 864 900 835 (NOTLB) f7603c84 00000086 d8460080 457110f1 
00003246 c2422bc0 f696b0c0 00000082 d8460080 45710fc1 00003246 c02af980 00000001 457110f1 00003246 d8460080 d84600a0 c2422bc0 00001710 4b664525 00003246 f72ec248 0347495e f7603c98 Call Trace: [<c012b62c>] schedule_timeout+0x6c/0xc0 [<c0142b11>] wakeup_bdflush+0x21/0x40 [<c012b5b0>] process_timeout+0x0/0x10 [<c011eb7b>] io_schedule_timeout+0x2b/0x40 [<c01f54a4>] blk_congestion_wait+0x84/0xa0 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c0148f9f>] try_to_free_pages+0xef/0x190 [<c014187c>] __alloc_pages+0x1dc/0x350 [<c0143bc2>] do_page_cache_readahead+0x172/0x1e0 [<c014ca8f>] do_swap_page+0x13f/0x320 [<c013e511>] filemap_nopage+0x191/0x3d0 [<c013e380>] filemap_nopage+0x0/0x3d0 [<c014cfd3>] do_no_page+0xd3/0x3c0 [<c014acc7>] pte_alloc_map+0xc7/0x110 [<c014d4f6>] handle_mm_fault+0x106/0x1b0 [<c011a94c>] do_page_fault+0x33c/0x530 [<c0171334>] sys_select+0x264/0x520 [<c011a610>] do_page_fault+0x0/0x530 [<c0109d25>] error_code+0x2d/0x38 splogger S 114C3021 5660 864 851 865 (NOTLB) f684beb4 00000086 f7da7330 114c3021 02002c19 00000003 f68addc0 00000009 f684bea4 00000000 c021dd7c f684d9a0 00000000 f7140580 f684bf90 f7da7330 f7da7350 c241abc0 00000a82 c14af7c2 000019e4 f684db68 c26e038c c26e0320 Call Trace: [<c021dd7c>] sockfd_lookup+0x1c/0x80 [<c0169f8e>] pipe_wait+0x7e/0xa0 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c01790da>] update_atime+0x9a/0xe0 [<c011fda0>] autoremove_wake_function+0x0/0x50 [<c016a19f>] pipe_readv+0x1ef/0x2f0 [<c016a2d8>] pipe_read+0x38/0x40 [<c015c4a8>] vfs_read+0xb8/0x130 [<c010fa00>] do_gettimeofday+0x20/0xc0 [<c015c752>] sys_read+0x42/0x70 [<c01092bb>] syscall_call+0x7/0xb qmail-lspawn S 00000001 0 865 851 866 864 (NOTLB) f685feb0 00000086 c241abc0 00000001 00000003 c138a310 f689b960 000001d5 f685ff40 00000000 c0141740 c02afe00 00000000 00000000 c14176c1 00000000 f71f2d00 c241abc0 000028b6 c141be0d 000019e4 f71f2ec8 00000000 7fffffff Call Trace: [<c0141740>] __alloc_pages+0xa0/0x350 
[<c012b67e>] schedule_timeout+0xbe/0xc0
--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc
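A trace of this size is easier to read once it is summarized per task. The following Python sketch is an illustration only and was not part of the thread. It assumes the task header layout seen in this dump (name, state, PC, free stack, pid) and the `[<address>] symbol+0xoff/0xsize` frame layout; the names `parse_tasks` and `blocked_in` are hypothetical. The sample is abbreviated from the trace above: the raw stack words and some frames are omitted.

```python
import re

# Task header as printed in this dump: name, state, PC, free stack, pid, ...
TASK_RE = re.compile(r"(?<!\S)(\S+)\s+([RSDTZ])\s+[0-9A-Fa-f]{8}\s+\d+\s+(\d+)\s")
# One call-trace frame: [<address>] symbol+0xoffset/0xsize
FRAME_RE = re.compile(r"\[<[0-9a-f]{8}>\]\s+(\w+)\+0x")


def parse_tasks(dump):
    """Split a sysrq-T dump into (name, state, pid, [symbols]) tuples."""
    headers = list(TASK_RE.finditer(dump))
    tasks = []
    for i, m in enumerate(headers):
        # A task's frames run from its header up to the next task header.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(dump)
        symbols = FRAME_RE.findall(dump[m.end():end])
        tasks.append((m.group(1), m.group(2), int(m.group(3)), symbols))
    return tasks


def blocked_in(tasks, symbol):
    """Names of D-state tasks whose call trace contains `symbol`."""
    return [name for name, state, _pid, syms in tasks
            if state == "D" and symbol in syms]


# Abbreviated excerpt of the trace above (stack words and some frames dropped).
SAMPLE = """
init D 28E916FC 24 1 0 2 (NOTLB)
Call Trace:
 [<c012b62c>] schedule_timeout+0x6c/0xc0
 [<c011eb7b>] io_schedule_timeout+0x2b/0x40
 [<c01f54a4>] blk_congestion_wait+0x84/0xa0
 [<c0148f9f>] try_to_free_pages+0xef/0x190
 [<c014187c>] __alloc_pages+0x1dc/0x350
 [<c014cb98>] do_swap_page+0x248/0x320
 [<c011a94c>] do_page_fault+0x33c/0x530
migration/0 S 00000001 12 2 1 3 (L-TLB)
Call Trace:
 [<c011f4af>] migration_thread+0xdf/0x160
 [<c011f3d0>] migration_thread+0x0/0x160
 [<c0106f79>] kernel_thread_helper+0x5/0xc
sshd D EBC82985 0 718 1 1051 741 699 (NOTLB)
Call Trace:
 [<c012b62c>] schedule_timeout+0x6c/0xc0
 [<c011eb7b>] io_schedule_timeout+0x2b/0x40
 [<c01f54a4>] blk_congestion_wait+0x84/0xa0
 [<c0148f9f>] try_to_free_pages+0xef/0x190
 [<c014187c>] __alloc_pages+0x1dc/0x350
 [<c0143bc2>] do_page_cache_readahead+0x172/0x1e0
 [<c013e511>] filemap_nopage+0x191/0x3d0
"""

if __name__ == "__main__":
    tasks = parse_tasks(SAMPLE)
    print(blocked_in(tasks, "try_to_free_pages"))
```

Run over the full dump, a summary like this makes it quick to see how many D-state tasks share the `blk_congestion_wait` / `try_to_free_pages` path.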
* Re: 2.6.1 IO lockup on SMP systems
2004-02-21 16:45 ` Sergey S. Kostyliov
@ 2004-02-21 19:30 ` Andrew Morton
2004-02-22 17:39 ` Alexander Y. Fomichev
0 siblings, 1 reply; 24+ messages in thread
From: Andrew Morton @ 2004-02-21 19:30 UTC (permalink / raw)
To: Sergey S. Kostyliov; +Cc: linux-kernel, anton
"Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
>
> Hello Andrew,
>
> On Sunday 01 February 2004 03:17, Andrew Morton wrote:
> > "Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
> > >
> > > I have experienced lockups on three of my servers with 2.6.1. It doesn't
> > > look like a deadlock: the box is still pingable and all TCP ports which were
> > > in listen state before a lockup remain in listen state, but I can't get
> > > any data from these ports. According to sar(1), the systems were not overloaded
> > > right before a lockup, and there are no log entries in any user service logs
> > > for almost 10 hours after the lockup.
> >
> > Please ensure that CONFIG_KALLSYMS is enabled, then generate an all-tasks
> > backtrace of a locked machine with sysrq-T or `echo t >
> > /proc/sysrq-trigger'. Then send us the resulting trace.
>
> I've just reproduced this lockup with 2.6.3.
>
> >
> > You may need a serial console to be able to capture all the output.
> >
> > Also, it would be useful to know what sort of load the machines are under,
> > and what filesystems are in use.
>
> The machine is an HTTP server. The main applications are:
> 1) apache 1.3, which serves PHP pages (mod_php):
> 15.3 requests/sec - 111.9 kB/second - 7.3 kB/request
> 54 requests currently being processed, 19 idle servers
> 2) mysql:
> Threads: 19 Questions: 26922012 Slow queries: 9799 Opens: 64980
> Flush tables: 1 Open tables: 630 Queries per second avg: 143.547
>
> This is an IO-bound machine in general. All filesystems are reiserfs.
>
> Here is the sysrq-T output obtained from a locked box via serial console:
OK, so everything is stuck trying to allocate memory. Perhaps you ran out of
swap space, or some process has gone berserk allocating memory.
How much memory does the machine have, and how much swap space?
I suggest that you run a `vmstat 30' trace on a terminal somewhere, see
what it says prior to the hangs. Also capture the sysrq-M output after it
has hung.
It would be useful to monitor the contents of /proc/vmstat also.
And perhaps keep top running in `sort by memory usage' mode.
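The /proc/vmstat monitoring suggested above could be done with a small script along these lines. This is a hedged sketch, not something from the thread: the function names, the log format and the 30-second default (borrowed from the `vmstat 30' suggestion) are assumptions. It relies only on /proc/vmstat holding one "name value" pair per line.

```python
import time


def parse_vmstat(text):
    """Parse /proc/vmstat content ("name value" per line) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def changed(before, after):
    """Counters whose value differs between two snapshots, as name -> delta."""
    return {k: after[k] - before[k] for k in after
            if k in before and after[k] != before[k]}


def monitor(log_path, interval=30, path="/proc/vmstat", rounds=None):
    """Append timestamped counter deltas to log_path every `interval` seconds.

    Runs forever when rounds is None; otherwise stops after `rounds` samples.
    """
    with open(path) as f:
        prev = parse_vmstat(f.read())
    done = 0
    while rounds is None or done < rounds:
        time.sleep(interval)
        with open(path) as f:
            cur = parse_vmstat(f.read())
        with open(log_path, "a") as log:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write("%s %s\n" % (stamp, sorted(changed(prev, cur).items())))
        prev = cur
        done += 1
```

Since the box stops doing IO when it hangs, a log on the local disk may lose its last samples. Pointing `log_path` at a terminal device, or running the script over an ssh session from another machine, is closer to the "on a terminal somewhere" advice.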
* Re: 2.6.1 IO lockup on SMP systems
2004-02-21 19:30 ` Andrew Morton
@ 2004-02-22 17:39 ` Alexander Y. Fomichev
2004-02-23 17:27 ` Sergey S. Kostyliov
0 siblings, 1 reply; 24+ messages in thread
From: Alexander Y. Fomichev @ 2004-02-22 17:39 UTC (permalink / raw)
To: Andrew Morton; +Cc: Sergey S. Kostyliov, linux-kernel, anton
On Saturday 21 February 2004 22:30, Andrew Morton wrote:
> "Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
> > Hello Andrew,
> >
> > On Sunday 01 February 2004 03:17, Andrew Morton wrote:
> > > "Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
> > > > I have experienced lockups on three of my servers with 2.6.1. It
> > > > doesn't look like a deadlock: the box is still pingable and all TCP
> > > > ports which were in listen state before a lockup remain in
> > > > listen state, but I can't get any data from these ports. According to
> > > > sar(1), the systems were not overloaded right before a lockup, and
> > > > there are no log entries in any user service logs for almost 10 hours
> > > > after the lockup.
> > >
> > > Please ensure that CONFIG_KALLSYMS is enabled, then generate an
> > > all-tasks backtrace of a locked machine with sysrq-T or `echo t >
> > > /proc/sysrq-trigger'. Then send us the resulting trace.
> >
> > I've just reproduced this lockup with 2.6.3.
> >
> > > You may need a serial console to be able to capture all the output.
> > >
> > > Also, it would be useful to know what sort of load the machines are
> > > under, and what filesystems are in use.
> >
> > The machine is an HTTP server. The main applications are:
> > 1) apache 1.3, which serves PHP pages (mod_php):
> > 15.3 requests/sec - 111.9 kB/second - 7.3 kB/request
> > 54 requests currently being processed, 19 idle servers
> > 2) mysql:
> > Threads: 19 Questions: 26922012 Slow queries: 9799 Opens: 64980
> > Flush tables: 1 Open tables: 630 Queries per second avg: 143.547
> >
> > This is an IO-bound machine in general. All filesystems are reiserfs.
> >
> > Here is the sysrq-T output obtained from a locked box via serial console:
>
> OK, so everything is stuck trying to allocate memory. Perhaps you ran out
> of swap space, or some process has gone berserk allocating memory.
>
> How much memory does the machine have, and how much swap space?
>
# free
             total       used       free     shared    buffers     cached
Mem:       2073868    2067508       6360          0     232708     897828
-/+ buffers/cache:     936972    1136896
Swap:      1535976       5228    1530748
> I suggest that you run a `vmstat 30' trace on a terminal somewhere, see
> what it says prior to the hangs.
OK, we'll try to get it next time.
> Also capture the sysrq-M output after it
> has hung.
>
These "showmem" and "showreg" outputs were taken just before the
"SysRq : Show State" from the previous message.
SysRq : Show Memory
Mem-info:
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
Free pages: 3172kB (512kB HighMem)
Active:1783 inactive:87 dirty:0 writeback:0 unstable:0 free:793
DMA free:1292kB min:16kB low:32kB high:48kB active:3748kB inactive:0kB
Normal free:1368kB min:936kB low:1872kB high:2808kB active:1368kB inactive:356kB
HighMem free:512kB min:512kB low:1024kB high:1536kB active:2008kB inactive:0kB
DMA: 151*4kB 70*8kB 6*16kB 1*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = B
Normal: 192*4kB 9*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB B
HighMem: 0*4kB 2*8kB 3*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB =B
Swap cache: add 1140128, delete 1140063, find 459572/584559, race 145+217
Free swap: 384364kB
524288 pages of RAM
294912 pages of HIGHMEM
5821 reserved pages
976 pages shared
65 pages swap cached
SysRq : Show Regs
Pid: 0, comm: swapper
EIP: 0060:[<c0106d1c>] CPU: 0
EIP is at default_idle+0x2c/0x40
EFLAGS: 00000246 Not tainted
EAX: 00000000 EBX: c02e6000 ECX: c0106cf0 EDX: c02e6000
ESI: c02e6000 EDI: c0105000 EBP: 0008e000 DS: 007b ES: 007b
CR0: 8005003b CR2: bffff7e0 CR3: 2d021000 CR4: 00000690
Call Trace:
[<c0106dab>] cpu_idle+0x3b/0x50
[<c02e88e9>] start_kernel+0x179/0x1a0
[<c02e84a0>] unknown_bootoption+0x0/0x120
> It would be useful to monitor the contents of /proc/vmstat also.
>
> And perhaps keep top running in `sort by memory usage' mode.
OK, we'll try that too.
--
< on behalf of "Sergey S. Kostyliov" <rathamahata@php4.ru> >
Best regards.
Alexander Y. Fomichev <gluk@php4.ru>
Public PGP key: http://sysadminday.org.ru/gluk.asc
* Re: 2.6.1 IO lockup on SMP systems
2004-02-22 17:39 ` Alexander Y. Fomichev
@ 2004-02-23 17:27 ` Sergey S. Kostyliov
2004-02-23 21:30 ` Mike Fedyk
2004-02-23 22:26 ` Andrew Morton
0 siblings, 2 replies; 24+ messages in thread
From: Sergey S. Kostyliov @ 2004-02-23 17:27 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, Alexander Y. Fomichev, anton

Hello Andrew,

This has now happened for the third time.

> > > I've just reproduced this lockup with 2.6.3.
> > >
> > > > You may need a serial console to be able to capture all the output.
> > > >
> > > > Also, it would be useful to know what sort of load the machines are
> > > > under, and what filesystems are in use.
> > >
> > > The machine is a http server. The main applications are:
> > > 1) apache 1.3 which serves php pages (mod_php):
> > >    15.3 requests/sec - 111.9 kB/second - 7.3 kB/request
> > >    54 requests currently being processed, 19 idle servers
> > > 2) mysql:
> > >    Threads: 19 Questions: 26922012 Slow queries: 9799 Opens: 64980
> > >    Flush tables: 1 Open tables: 630 Queries per second avg: 143.547
> > >
> > > This is an IO bound machine in general. All filesystems are reiserfs.
> > >
> > > Here is a sysrq-T output obtained from a locked box via serial console:
> >
> > OK, so everything is stuck trying to allocate memory. Perhaps you ran out
> > of swapspace, or some process has gone berserk allocating memory.

The memory exhaustion is indeed possible for this box. I'll double check
ulimit and /etc/security/limits.conf stuff. The only thing which worries
me is that this box had been running for months without any problems with
2.4.23aa1.

I have added another 2GB to swap space (hope this gives enough time
to find the memory hungry process(es)).

> >
> > How much memory does the machine have, and how much swap space?
> >
> # free
>              total       used       free     shared    buffers     cached
> Mem:       2073868    2067508       6360          0     232708     897828
> -/+ buffers/cache:     936972    1136896
> Swap:      1535976       5228    1530748
>
> > I suggest that you run a `vmstat 30' trace on a terminal somewhere, see
> > what it says prior to the hangs.
> Ok. We'll try to get it next time.

Here it is:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 1  0 551920   8108 203744 933532    0    0     4    68 1214   426  5  1 92  2
 0  0 551928   7140 203756 930316    0    0    17    61 1240   529  8  1 89  2
 0  0 551976   5788 203772 928224    1    6   360   139 1297   317  7  2 83  8
 0  0 551968   7588 203812 923504    0    0    19   125 1303   308  8  2 87  4
 0  1 551976  10444 203892 914100    0    0    25   127 1433   438 10  3 85  3
 0  0 551976   9220 204004 914804    0    0   123   126 1278   325  6  1 88  5
 0  0 551976   8108 204044 912248    0    0    38    69 1279   291  6  1 91  2
 0  1 551976  11828 204144 912320    1    0   135    94 1249   296  6  1 89  3
 0  5 562204   3280 203952 157084    1  566   305   674 1281   313  6  4 73 17
 0 18 598224   4276   1888  33356   91 2734   233  2761 1090   199  0  2  0 97
 1 38 662520   2760   2104  30520  110 3721   261  3738 1161   831  1  2  0 97
10 41 699936   2772   1920  28716  123 2924   249  2946 1103  1273  0  3  0 97
 0 39 748588   2956   1956  22668  160 3313   245  3331 1056  1047  0  2  0 98
 0 38 796100   3108   1888  21348  321 3191   430  3206 1045  1002  0  2  0 97
 4 43 844532   3308   1956  17644  518 3719   670  3733 1357   999  0  2  0 98
 0 51 882596   2940   2052  13960  520 2796   705  2810 1048  1182  0  2  0 98
 3 59 913392   2456   2048  10900 1013 2524  1308  2542 1144   601  0  2  0 98
 5 71 937816   2760   2072   8584 1534 2681  1860  2702 1234   607  0  2  0 97

>
> > Also capture the sysrq-M output after it
> > has hung.
> >
> This "showmem" && "showreg" have been taken just before
> "SysRq: Show State" from previous message.
> > SysRq : Show Memory > Mem-info: > DMA per-cpu: > cpu 0 hot: low 2, high 6, batch 1 > cpu 0 cold: low 0, high 2, batch 1 > cpu 1 hot: low 2, high 6, batch 1 > cpu 1 cold: low 0, high 2, batch 1 > Normal per-cpu: > cpu 0 hot: low 32, high 96, batch 16 > cpu 0 cold: low 0, high 32, batch 16 > cpu 1 hot: low 32, high 96, batch 16 > cpu 1 cold: low 0, high 32, batch 16 > HighMem per-cpu: > cpu 0 hot: low 32, high 96, batch 16 > cpu 0 cold: low 0, high 32, batch 16 > cpu 1 hot: low 32, high 96, batch 16 > cpu 1 cold: low 0, high 32, batch 16 > > Free pages: 3172kB (512kB HighMem) > Active:1783 inactive:87 dirty:0 writeback:0 unstable:0 free:793 > DMA free:1292kB min:16kB low:32kB high:48kB active:3748kB inactive:0kB > Normal free:1368kB min:936kB low:1872kB high:2808kB active:1368kB > inactive:356kB > HighMem free:512kB min:512kB low:1024kB high:1536kB active:2008kB > inactive:0kB > DMA: 151*4kB 70*8kB 6*16kB 1*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB > 0*2048kB 0*4096kB = B > Normal: 192*4kB 9*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 1*512kB > 0*1024kB 0*2048kB 0*4096kB B > HighMem: 0*4kB 2*8kB 3*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB > 0*2048kB 0*4096kB =B > Swap cache: add 1140128, delete 1140063, find 459572/584559, race 145+217 > Free swap: 384364kB > 524288 pages of RAM > 294912 pages of HIGHMEM > 5821 reserved pages > 976 pages shared > 65 pages swap cached > > > SysRq : Show Regs > > Pid: 0, comm: swapper > EIP: 0060:[<c0106d1c>] CPU: 0 > EIP is at default_idle+0x2c/0x40 > EFLAGS: 00000246 Not tainted > EAX: 00000000 EBX: c02e6000 ECX: c0106cf0 EDX: c02e6000 > ESI: c02e6000 EDI: c0105000 EBP: 0008e000 DS: 007b ES: 007b > CR0: 8005003b CR2: bffff7e0 CR3: 2d021000 CR4: 00000690 > Call Trace: > [<c0106dab>] cpu_idle+0x3b/0x50 > [<c02e88e9>] start_kernel+0x179/0x1a0 > [<c02e84a0>] unknown_bootoption+0x0/0x120 I forgot to switch output capture on in minicom, so the sysrq-M was scrolled out of the terminal by subsequent sysrq-T, largest part of 
which was in turn scrolled out. But the sysrq-T part is almost the same as previous one. > > > It would be useful to monitor the contents of /proc/vmstat also. The last /proc/vmstat content is 5 minutes before a real lockup (It looks like simple "while true; do date; cat /proc/vmstat; sleep 10; done" script suffer from the same memory exhaustion problem.) Mon Feb 23 17:41:34 MSK 2004 nr_dirty 136 nr_writeback 0 nr_unstable 0 nr_page_table_pages 987 nr_mapped 227018 nr_slab 13041 pgpgin 8593704 pgpgout 4349808 pswpin 169183 pswpout 183480 pgalloc 20244471 pgfree 20247061 pgactivate 548813 pgdeactivate 628769 pgfault 25756129 pgmajfault 67820 pgscan 4570640 pgrefill 2934423 pgsteal 2024118 pginodesteal 0 kswapd_steal 1886046 kswapd_inodesteal 891 pageoutrun 10047 allocstall 3930 pgrotated 178662 Mon Feb 23 17:41:44 MSK 2004 nr_dirty 339 nr_writeback 0 nr_unstable 0 nr_page_table_pages 991 nr_mapped 226443 nr_slab 13036 pgpgin 8593956 pgpgout 4351080 pswpin 169186 pswpout 183480 pgalloc 20250240 pgfree 20253382 pgactivate 549009 pgdeactivate 628769 pgfault 25764719 pgmajfault 67827 pgscan 4570640 pgrefill 2934423 pgsteal 2024118 pginodesteal 0 kswapd_steal 1886046 kswapd_inodesteal 891 pageoutrun 10047 allocstall 3930 pgrotated 178662 Mon Feb 23 17:41:54 MSK 2004 nr_dirty 505 nr_writeback 0 nr_unstable 0 nr_page_table_pages 993 nr_mapped 226477 nr_slab 13049 pgpgin 8594244 pgpgout 4352144 pswpin 169186 pswpout 183480 pgalloc 20256355 pgfree 20259400 pgactivate 549048 pgdeactivate 628769 pgfault 25772385 pgmajfault 67837 pgscan 4570640 pgrefill 2934423 pgsteal 2024118 pginodesteal 0 kswapd_steal 1886046 kswapd_inodesteal 891 pageoutrun 10047 allocstall 3930 pgrotated 178662 Mon Feb 23 17:42:15 MSK 2004 nr_dirty 0 nr_writeback 765 nr_unstable 0 nr_page_table_pages 1044 nr_mapped 209677 nr_slab 4672 pgpgin 8605592 pgpgout 4424120 pswpin 169454 pswpout 201127 pgalloc 20561829 pgfree 20563033 pgactivate 601317 pgdeactivate 778533 pgfault 25777874 pgmajfault 68001 pgscan 
5399589 pgrefill 3543496 pgsteal 2300249 pginodesteal 0 kswapd_steal 2058168 kswapd_inodesteal 14284 pageoutrun 10114 allocstall 7008 pgrotated 193130 Mon Feb 23 17:42:47 MSK 2004 nr_dirty 1 nr_writeback 597 nr_unstable 0 nr_page_table_pages 1213 nr_mapped 190410 nr_slab 4640 pgpgin 8614032 pgpgout 4500108 pswpin 170334 pswpout 219922 pgalloc 20588517 pgfree 20589474 pgactivate 711818 pgdeactivate 908805 pgfault 25783157 pgmajfault 68204 pgscan 5667215 pgrefill 3774369 pgsteal 2322731 pginodesteal 0 kswapd_steal 2066149 kswapd_inodesteal 14352 pageoutrun 10167 allocstall 7383 pgrotated 209403 > > > > And perhaps keep top running in `sort by memory usage' mode. > ok, we'll try too. Unfortunately the top output is kind of useless because mysql hide the real problem, I'll try to run top in batch mode next time. top - 17:47:03 up 7:10, 3 users, load average: 124.72, 66.96, 27.71 Tasks: 219 total, 1 running, 218 sleeping, 0 stopped, 0 zombie Cpu(s): 0.2% us, 2.1% sy, 0.0% ni, 0.0% id, 97.6% wa, 0.1% hi, 0.0% si Mem: 2073868k total, 2070796k used, 3072k free, 1996k buffers Swap: 1535976k total, 944520k used, 591456k free, 6884k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 896 mysql 15 0 1013m 21m 4896 S 0.1 1.1 0:05.64 mysqld 939 mysql 16 0 1013m 21m 4896 S 0.0 1.1 0:00.05 mysqld 940 mysql 20 0 1013m 21m 4896 S 0.0 1.1 0:00.00 mysqld 941 mysql 17 0 1013m 21m 4896 S 0.0 1.1 0:00.00 mysqld 942 mysql 15 0 1013m 21m 4896 D 0.4 1.1 1:26.19 mysqld 943 mysql 16 0 1013m 21m 4896 S 0.0 1.1 0:00.00 mysqld 961 mysql 17 0 1013m 21m 4896 D 0.0 1.1 0:00.00 mysqld 962 mysql 15 0 1013m 21m 4896 D 0.0 1.1 0:13.12 mysqld 971 mysql 16 0 1013m 21m 4896 S 0.0 1.1 0:00.05 mysqld 972 mysql 16 0 1013m 21m 4896 S 0.0 1.1 0:03.12 mysqld 27314 mysql 15 0 1013m 21m 4896 D 0.0 1.1 0:11.73 mysqld 27325 mysql 15 0 1013m 21m 4896 S 0.0 1.1 0:08.70 mysqld 27339 mysql 15 0 1013m 21m 4896 S 0.0 1.1 0:07.78 mysqld 27361 mysql 16 0 1013m 21m 4896 S 0.0 1.1 0:10.05 mysqld 27375 mysql 15 0 
1013m 21m 4896 S 0.0 1.1 0:10.61 mysqld 27390 mysql 16 0 1013m 21m 4896 S 0.0 1.1 0:11.44 mysqld 27392 mysql 16 0 1013m 21m 4896 S 0.0 1.1 0:09.20 mysqld 27393 mysql 16 0 1013m 21m 4896 S 0.0 1.1 0:12.23 mysqld 3671 mysql 16 0 1013m 21m 4896 D 0.0 1.1 0:00.04 mysqld 3672 mysql 18 0 1013m 21m 4896 D 0.0 1.1 0:00.11 mysqld 3691 mysql 16 0 1013m 21m 4896 D 0.1 1.1 0:00.02 mysqld 3704 mysql 17 0 1013m 21m 4896 S 0.0 1.1 0:00.02 mysqld Thank you for your help! -- Best regards, Sergey S. Kostyliov <rathamahata@php4.ru> Public PGP key: http://sysadminday.org.ru/rathamahata.asc ^ permalink raw reply [flat|nested] 24+ messages in thread
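Fields such as pswpout and allocstall in the /proc/vmstat snapshots above are cumulative counters, so the useful figure is the change between two samples. Between the 17:41:54 and 17:42:15 snapshots, pswpout rose from 183480 to 201127 (17647 pages, roughly 69 MB assuming 4 kB pages) and allocstall rose from 3930 to 7008. A small sketch of that subtraction, assuming each snapshot was saved to its own file in the usual one-counter-per-line format; `vmstat_delta` and the file names are hypothetical:

```shell
#!/bin/sh
# Print how much one /proc/vmstat counter grew between two saved snapshots.
#   $1 = counter name, $2 = older snapshot file, $3 = newer snapshot file
# "vmstat_delta" is a hypothetical helper name used only for this illustration.
vmstat_delta() {
    old=$(awk -v key="$1" '$1 == key { print $2 }' "$2")
    new=$(awk -v key="$1" '$1 == key { print $2 }' "$3")
    echo $((new - old))
}

# Usage (file names are assumptions):
#   cat /proc/vmstat > snap.1; sleep 10; cat /proc/vmstat > snap.2
#   vmstat_delta pswpout snap.1 snap.2
```

The nr_* fields (nr_dirty, nr_mapped, nr_slab and so on) are current values rather than counters, so for those the raw numbers are what matter, not the difference.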
* Re: 2.6.1 IO lockup on SMP systems
2004-02-23 17:27 ` Sergey S. Kostyliov
@ 2004-02-23 21:30 ` Mike Fedyk
2004-02-24 11:56 ` Sergey S. Kostyliov
2004-02-23 22:26 ` Andrew Morton
1 sibling, 1 reply; 24+ messages in thread
From: Mike Fedyk @ 2004-02-23 21:30 UTC (permalink / raw)
To: Sergey S. Kostyliov
Cc: Andrew Morton, linux-kernel, Alexander Y. Fomichev, anton

Sergey S. Kostyliov wrote:
> Hello Andrew,
>
> This has now happened for the third time.
>
>
>>>>I've just reproduced this lockup with 2.6.3.
>>>>
>>>>
>>>>>You may need a serial console to be able to capture all the output.
>>>>>
>>>>>Also, it would be useful to know what sort of load the machines are
>>>>>under, and what filesystems are in use.
>>>>
>>>>The machine is a http server. The main applications are:
>>>>1) apache 1.3 which serves php pages (mod_php):
>>>>   15.3 requests/sec - 111.9 kB/second - 7.3 kB/request
>>>>   54 requests currently being processed, 19 idle servers
>>>>2) mysql:
>>>>   Threads: 19 Questions: 26922012 Slow queries: 9799 Opens: 64980
>>>>   Flush tables: 1 Open tables: 630 Queries per second avg: 143.547
>>>>
>>>>This is an IO bound machine in general. All filesystems are reiserfs.
>>>>
>>>>Here is a sysrq-T output obtained from a locked box via serial console:
>>>
>>>OK, so everything is stuck trying to allocate memory. Perhaps you ran out
>>>of swapspace, or some process has gone berserk allocating memory.
>
>
> The memory exhaustion is indeed possible for this box. I'll double check
> ulimit and /etc/security/limits.conf stuff. The only thing which worries
> me is that this box had been running for months without any problems with
> 2.4.23aa1.
>
> I have added another 2GB to swap space (hope this gives enough time
> to find the memory hungry process(es)).

Also check how much memory is being used for slab in /proc/meminfo.
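For reference, the nr_slab field in the /proc/vmstat snapshots earlier in the thread is reported in pages and should correspond to the Slab figure in /proc/meminfo. A minimal sketch of the conversion, assuming 4 kB pages (as the 4kB entries in the sysrq-M dump suggest); `slab_kb` is a hypothetical helper, not something used in the thread:

```shell
#!/bin/sh
# Convert the nr_slab value (in pages) from /proc/vmstat-style input to kB.
#   $1 = page size in kB; defaults to 4, which is an assumption for this box
# "slab_kb" is a hypothetical helper name used only for this illustration.
slab_kb() {
    awk -v page_kb="${1:-4}" '$1 == "nr_slab" { print $2 * page_kb }'
}

# Usage:
#   slab_kb < /proc/vmstat
```

With the values already posted, nr_slab 13049 works out to 52196 kB (about 51 MB) at 17:41:54 and nr_slab 4672 to 18688 kB (about 18 MB) at 17:42:15, so slab use fell between those two samples.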
* Re: 2.6.1 IO lockup on SMP systems
2004-02-23 21:30 ` Mike Fedyk
@ 2004-02-24 11:56 ` Sergey S. Kostyliov
0 siblings, 0 replies; 24+ messages in thread
From: Sergey S. Kostyliov @ 2004-02-24 11:56 UTC (permalink / raw)
To: Mike Fedyk; +Cc: Andrew Morton, linux-kernel, Alexander Y. Fomichev, anton

On Tuesday 24 February 2004 00:30, Mike Fedyk wrote:
> Sergey S. Kostyliov wrote:
> > Hello Andrew,
> >
> > This has now happened for the third time.
> >
> >
> >>>>I've just reproduced this lockup with 2.6.3.
> >>>>
> >>>>
> >>>>>You may need a serial console to be able to capture all the output.
> >>>>>
> >>>>>Also, it would be useful to know what sort of load the machines are
> >>>>>under, and what filesystems are in use.
> >>>>
> >>>>The machine is a http server. The main applications are:
> >>>>1) apache 1.3 which serves php pages (mod_php):
> >>>>   15.3 requests/sec - 111.9 kB/second - 7.3 kB/request
> >>>>   54 requests currently being processed, 19 idle servers
> >>>>2) mysql:
> >>>>   Threads: 19 Questions: 26922012 Slow queries: 9799 Opens: 64980
> >>>>   Flush tables: 1 Open tables: 630 Queries per second avg: 143.547
> >>>>
> >>>>This is an IO bound machine in general. All filesystems are reiserfs.
> >>>>
> >>>>Here is a sysrq-T output obtained from a locked box via serial console:
> >>>
> >>>OK, so everything is stuck trying to allocate memory. Perhaps you ran out
> >>>of swapspace, or some process has gone berserk allocating memory.
> >
> >
> > The memory exhaustion is indeed possible for this box. I'll double check
> > ulimit and /etc/security/limits.conf stuff. The only thing which worries
> > me is that this box had been running for months without any problems with
> > 2.4.23aa1.
> >
> > I have added another 2GB to swap space (hope this gives enough time
> > to find the memory hungry process(es)).
>
> Also check how much memory is being used for slab in /proc/meminfo.

Thanks for the hint, I will do this next time.

--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc
* Re: 2.6.1 IO lockup on SMP systems
2004-02-23 17:27 ` Sergey S. Kostyliov
2004-02-23 21:30 ` Mike Fedyk
@ 2004-02-23 22:26 ` Andrew Morton
2004-02-24 7:23 ` Marcelo Tosatti
2004-02-24 11:54 ` Sergey S. Kostyliov
1 sibling, 2 replies; 24+ messages in thread
From: Andrew Morton @ 2004-02-23 22:26 UTC (permalink / raw)
To: Sergey S. Kostyliov; +Cc: linux-kernel, gluk, anton

"Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
>
> > > OK, so everything is stuck trying to allocate memory. Perhaps you ran out
> > > of swapspace, or some process has gone berserk allocating memory.
>
> The memory exhaustion is indeed possible for this box. I'll double check
> ulimit and /etc/security/limits.conf stuff. The only thing which worries
> me is that this box had been running for months without any problems with
> 2.4.23aa1.

It is conceivable that you have some application which runs OK on 2.4.x but
has some subtle bug which causes the app to go crazy on a 2.6 kernel,
consuming lots of memory. Or there's a bug in the 2.6 kernel ;)

> I have added another 2GB to swap space (hope this gives enough time
> to find the memory hungry process(es)).
>
> > >
> > > How much memory does the machine have, and how much swap space?
> > >
> > # free
> >              total       used       free     shared    buffers     cached
> > Mem:       2073868    2067508       6360          0     232708     897828
> > -/+ buffers/cache:     936972    1136896
> > Swap:      1535976       5228    1530748
> >
> > > I suggest that you run a `vmstat 30' trace on a terminal somewhere, see
> > > what it says prior to the hangs.
> > Ok. We'll try to get it next time.
>
> Here it is:
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
>  1  0 551920   8108 203744 933532    0    0     4    68 1214   426  5  1 92  2
>  0  0 551928   7140 203756 930316    0    0    17    61 1240   529  8  1 89  2
>  0  0 551976   5788 203772 928224    1    6   360   139 1297   317  7  2 83  8
>  0  0 551968   7588 203812 923504    0    0    19   125 1303   308  8  2 87  4
>  0  1 551976  10444 203892 914100    0    0    25   127 1433   438 10  3 85  3
>  0  0 551976   9220 204004 914804    0    0   123   126 1278   325  6  1 88  5
>  0  0 551976   8108 204044 912248    0    0    38    69 1279   291  6  1 91  2
>  0  1 551976  11828 204144 912320    1    0   135    94 1249   296  6  1 89  3
>  0  5 562204   3280 203952 157084    1  566   305   674 1281   313  6  4 73 17
>  0 18 598224   4276   1888  33356   91 2734   233  2761 1090   199  0  2  0 97
>  1 38 662520   2760   2104  30520  110 3721   261  3738 1161   831  1  2  0 97
> 10 41 699936   2772   1920  28716  123 2924   249  2946 1103  1273  0  3  0 97
>  0 39 748588   2956   1956  22668  160 3313   245  3331 1056  1047  0  2  0 98
>  0 38 796100   3108   1888  21348  321 3191   430  3206 1045  1002  0  2  0 97
>  4 43 844532   3308   1956  17644  518 3719   670  3733 1357   999  0  2  0 98
>  0 51 882596   2940   2052  13960  520 2796   705  2810 1048  1182  0  2  0 98
>  3 59 913392   2456   2048  10900 1013 2524  1308  2542 1144   601  0  2  0 98
>  5 71 937816   2760   2072   8584 1534 2681  1860  2702 1234   607  0  2  0 97

OK, so it's doing a lot of swapping and your swap utilisation is
continuously increasing. I would suspect an application or kernel memory
leak.

I suggest you keep that `vmstat 30' running all the time. When the machine
dies, take a look at the final 20 lines.

Also, run

    while true
    do
        cat /proc/meminfo
        sleep 10
    done

and record the info which that leaves behind when the machine locks up.
This should tell us whether it is an application or kernel memory leak. If
it is indeed a leak.
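The /proc/meminfo loop above can be made easier to read after the fact by logging one compact line per sample. A sketch under stated assumptions: the `meminfo_line` helper and the log path are hypothetical, and the SwapTotal, SwapFree, Slab and Mapped field names are assumed to be present in /proc/meminfo on the kernel in question:

```shell
#!/bin/sh
# Summarise /proc/meminfo-style input (on stdin) as a single line, in kB.
# "meminfo_line" is a hypothetical helper name used only for this illustration.
meminfo_line() {
    awk '
        $1 == "SwapTotal:" { total  = $2 }
        $1 == "SwapFree:"  { free   = $2 }
        $1 == "Slab:"      { slab   = $2 }
        $1 == "Mapped:"    { mapped = $2 }
        END {
            # Swap in use is derived; the other two are copied as reported.
            printf "swap_used=%d slab=%d mapped=%d\n", total - free, slab, mapped
        }
    '
}

# Usage (the log path is an assumption):
#   while true
#   do
#       echo "$(date '+%H:%M:%S') $(meminfo_line < /proc/meminfo)" >> /root/meminfo.log
#       sleep 10
#   done
```

Watching which of the three numbers climbs before a hang is only a rough hint: steadily growing slab would point more towards the kernel, while growing mapped memory and swap use would point more towards an application.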
* Re: 2.6.1 IO lockup on SMP systems 2004-02-23 22:26 ` Andrew Morton @ 2004-02-24 7:23 ` Marcelo Tosatti 2004-02-24 6:53 ` Andrew Morton 2004-02-24 11:54 ` Sergey S. Kostyliov 1 sibling, 1 reply; 24+ messages in thread From: Marcelo Tosatti @ 2004-02-24 7:23 UTC (permalink / raw) To: Andrew Morton; +Cc: Sergey S. Kostyliov, linux-kernel, gluk, anton On Mon, 23 Feb 2004, Andrew Morton wrote: > "Sergey S. Kostyliov" <rathamahata@php4.ru> wrote: > > > > > > OK, so everything is stuck trying to allocate memory. Perhaps you ran out > > > > of swapspace, or some process has gone berzerk allocating memory. > > > > The memory exhaustion is indeed possible for this box. I'll double check > > ulimit and /etc/security/limits.conf stuff. The only thing which worries > > me that this box had been running for months without any problems with > > 2.4.23aa1. > > It is conceivable that you have some application which runs OK on 2.4.x but > has some subtle bug which causes the app to go crazy on a 2.6 kernel > consuming lots of memory. Or there's a bug in the 2.6 kernel ;) > > > I have added another 2Gb to swap space (hope this give enough time > > to find the memory hungry process(es)). > > > > > > > > > > How much memory does the machine have, and how much swap space? > > > > > > > # free > > > total used free shared buffers cached > > > Mem: 2073868 2067508 6360 0 232708 897828 > > > -/+ buffers/cache: 936972 1136896 > > > Swap: 1535976 5228 1530748 > > > > > > > I suggest that you run a `vmstat 30' trace on a terminal somewhere, see > > > > what it says prior to the hangs. > > > Ok.We'll try to get it next time. 
> > > > Here it is: > > procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- > > r b swpd free buff cache si so bi bo in cs us sy id wa > > 1 0 551920 8108 203744 933532 0 0 4 68 1214 426 5 1 92 2 > > 0 0 551928 7140 203756 930316 0 0 17 61 1240 529 8 1 89 2 > > 0 0 551976 5788 203772 928224 1 6 360 139 1297 317 7 2 83 8 > > 0 0 551968 7588 203812 923504 0 0 19 125 1303 308 8 2 87 4 > > 0 1 551976 10444 203892 914100 0 0 25 127 1433 438 10 3 85 3 > > 0 0 551976 9220 204004 914804 0 0 123 126 1278 325 6 1 88 5 > > 0 0 551976 8108 204044 912248 0 0 38 69 1279 291 6 1 91 2 > > 0 1 551976 11828 204144 912320 1 0 135 94 1249 296 6 1 89 3 > > 0 5 562204 3280 203952 157084 1 566 305 674 1281 313 6 4 73 17 > > 0 18 598224 4276 1888 33356 91 2734 233 2761 1090 199 0 2 0 97 > > 1 38 662520 2760 2104 30520 110 3721 261 3738 1161 831 1 2 0 97 > > 10 41 699936 2772 1920 28716 123 2924 249 2946 1103 1273 0 3 0 97 > > 0 39 748588 2956 1956 22668 160 3313 245 3331 1056 1047 0 2 0 98 > > 0 38 796100 3108 1888 21348 321 3191 430 3206 1045 1002 0 2 0 97 > > 4 43 844532 3308 1956 17644 518 3719 670 3733 1357 999 0 2 0 98 > > 0 51 882596 2940 2052 13960 520 2796 705 2810 1048 1182 0 2 0 98 > > 3 59 913392 2456 2048 10900 1013 2524 1308 2542 1144 601 0 2 0 98 > > 5 71 937816 2760 2072 8584 1534 2681 1860 2702 1234 607 0 2 0 97 > > OK, so it's doing a lot of swapping and your swap utilisation is > continuously increasing. I would suspect an application or kernel memory > leak. > > I suggest you keep that `vmstat 30' running all the time. When the machine > dies, take a look at the final 20 lines. > > Also, run > > while true > do > cat /proc/meminfo > sleep 10 > done > > and record the info which that leaves behind when the machine locks up. > This should tell us whether it is an application or kernel memory leak. If > it is indeed a leak. Hi Andrew, Care to explain me why should the kernel hang if due to an application leak ? 
The hang looks wrong even if the leak is in a userspace app, yes?
* Re: 2.6.1 IO lockup on SMP systems
2004-02-24 7:23 ` Marcelo Tosatti
@ 2004-02-24 6:53 ` Andrew Morton
0 siblings, 0 replies; 24+ messages in thread
From: Andrew Morton @ 2004-02-24 6:53 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: rathamahata, linux-kernel, gluk, anton

Marcelo Tosatti <marcelo.tosatti@cyclades.com> wrote:
>
> > Also, run
> >
> >     while true
> >     do
> >         cat /proc/meminfo
> >         sleep 10
> >     done
> >
> > and record the info which that leaves behind when the machine locks up.
> > This should tell us whether it is an application or kernel memory leak. If
> > it is indeed a leak.
>
> Hi Andrew,
>
> Care to explain me why should the kernel hang if due to an application
> leak ?

It shouldn't - the oom killer should have done something. But we'll
address that once we've confirmed that something really is leaking.

> The hang looks wrong even if the leak is in a userspace app, yes?

Probably, yes.
* Re: 2.6.1 IO lockup on SMP systems
2004-02-23 22:26 ` Andrew Morton
2004-02-24 7:23 ` Marcelo Tosatti
@ 2004-02-24 11:54 ` Sergey S. Kostyliov
2004-02-26 12:19 ` Sergey S. Kostyliov
1 sibling, 1 reply; 24+ messages in thread
From: Sergey S. Kostyliov @ 2004-02-24 11:54 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, gluk, anton

On Tuesday 24 February 2004 01:26, Andrew Morton wrote:
> "Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:

<cut>

> > The memory exhaustion is indeed possible for this box. I'll double check
> > ulimit and /etc/security/limits.conf stuff. The only thing which worries
> > me is that this box had been running for months without any problems with
> > 2.4.23aa1.
>
> It is conceivable that you have some application which runs OK on 2.4.x but
> has some subtle bug which causes the app to go crazy on a 2.6 kernel,
> consuming lots of memory. Or there's a bug in the 2.6 kernel ;)
>
> > I have added another 2GB to swap space (hope this gives enough time
> > to find the memory hungry process(es)).

<cut>

>
> OK, so it's doing a lot of swapping and your swap utilisation is
> continuously increasing. I would suspect an application or kernel memory
> leak.
>
> I suggest you keep that `vmstat 30' running all the time. When the machine
> dies, take a look at the final 20 lines.
Here is from the last lockup: 1) last 20 entries of the `vmstat 30': procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 1 1 116676 7752 266156 621360 8 1 1031 186 1364 444 53 5 30 12 1 0 116656 7512 266316 617716 2 3 334 79 1355 334 59 4 34 3 1 0 116240 8072 266800 616444 17 1 539 302 1397 464 59 7 29 6 1 0 116216 13320 266948 614044 1 1 1229 92 1505 587 61 6 27 6 2 0 116208 8344 267152 618048 1 0 436 143 1367 386 58 5 32 5 1 1 116216 6024 267308 619188 0 59 4574 164 1554 742 61 6 20 12 1 1 116284 6468 267736 614028 4 2 1087 117 1458 529 60 7 27 6 1 0 116280 6336 267888 617860 1 0 1225 101 1419 542 59 6 30 6 2 1 116472 7264 268148 619288 0 4 7788 100 1645 950 33 6 29 33 1 1 116728 5976 268296 617112 0 7 7799 86 1566 815 30 6 32 32 2 0 116752 6080 268488 615992 6 8 7434 136 1627 910 34 7 25 34 0 1 116944 6368 268588 615420 1 4 7601 95 1696 952 39 6 25 30 1 0 116968 30600 268896 585832 0 4 2212 176 1584 642 62 7 16 15 0 1 116968 6128 269064 604912 0 0 1410 67 1460 532 60 5 29 6 1 0 116964 6280 269308 604008 0 4 7449 106 1561 819 35 5 30 30 0 1 116976 6080 269400 603208 1 0 7317 121 1535 762 31 6 31 32 1 16 331784 4452 2488 25132 30 7369 1916 7441 1177 333 7 6 6 81 1 26 627540 3116 2172 23156 134 10159 217 10173 1159 200 0 4 0 96 5 29 884564 3144 2036 16032 468 9443 622 9471 1106 435 0 5 0 95 0 50 1097880 2800 2108 8592 484 7141 794 7164 1119 831 0 6 0 94 2) sysrq-M (This one looks strange to me because of "Free swap: 2326708kB") SysRq : Show Memory Mem-info: DMA per-cpu: cpu 0 hot: low 2, high 6, batch 1 cpu 0 cold: low 0, high 2, batch 1 cpu 1 hot: low 2, high 6, batch 1 cpu 1 cold: low 0, high 2, batch 1 Normal per-cpu: cpu 0 hot: low 32, high 96, batch 16 cpu 0 cold: low 0, high 32, batch 16 cpu 1 hot: low 32, high 96, batch 16 cpu 1 cold: low 0, high 32, batch 16 HighMem per-cpu: cpu 0 hot: low 32, high 96, batch 16 cpu 0 cold: low 0, high 32, batch 16 cpu 1 hot: low 32, high 96, 
batch 16 cpu 1 cold: low 0, high 32, batch 16 Free pages: 2136kB (512kB HighMem) Active:832 inactive:103 dirty:0 writeback:0 unstable:0 free:534 DMA free:256kB min:16kB low:32kB high:48kB active:0kB inactive:0kB Normal free:1368kB min:936kB low:1872kB high:2808kB active:1380kB inactive:352kB HighMem free:512kB min:512kB low:1024kB high:1536kB active:2008kB inactive:0kB DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 256kB Normal: 170*4kB 10*8kB 0*16kB 1*32kB 1*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 1368kB HighMem: 8*4kB 0*8kB 2*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 512kB Swap cache: add 392088, delete 392033, find 22279/32705, race 16+47 Free swap: 2326708kB 524288 pages of RAM 294912 pages of HIGHMEM 5821 reserved pages 860 pages shared 55 pages swap cached 3) sysrq-T: http://sysadminday.org.ru/2.6.3-io_lockup/ope/sysrq-T 4) 3 last copies of /proc/vmstat Tue Feb 24 02:36:53 MSK 2004 nr_dirty 320 nr_writeback 0 nr_unstable 0 nr_page_table_pages 822 nr_mapped 289207 nr_slab 11709 pgpgin 15829228 pgpgout 18320340 pswpin 25882 pswpout 37006 pgalloc 28844087 pgfree 28845931 pgactivate 923552 pgdeactivate 760039 pgfault 25500106 pgmajfault 66503 pgscan 7611061 pgrefill 4989936 pgsteal 5628844 pginodesteal 0 kswapd_steal 5211828 kswapd_inodesteal 2958 pageoutrun 33148 allocstall 12799 pgrotated 205322 Tue Feb 24 02:37:03 MSK 2004 nr_dirty 566 nr_writeback 0 nr_unstable 0 nr_page_table_pages 823 nr_mapped 289174 nr_slab 11733 pgpgin 15917192 pgpgout 18321888 pswpin 25882 pswpout 37006 pgalloc 28886326 pgfree 28888201 pgactivate 923806 pgdeactivate 760254 pgfault 25519499 pgmajfault 66550 pgscan 7633363 pgrefill 5008883 pgsteal 5650891 pginodesteal 0 kswapd_steal 5233875 kswapd_inodesteal 2958 pageoutrun 33287 allocstall 12799 pgrotated 205322 Tue Feb 24 02:37:23 MSK 2004 nr_dirty 4 nr_writeback 4559 nr_unstable 0 nr_page_table_pages 962 nr_mapped 197703 nr_slab 4887 pgpgin 
15935652 pgpgout 18698124 pswpin 26444 pswpout 130749 pgalloc 29203531 pgfree 29204764 pgactivate 927401 pgdeactivate 944643 pgfault 25525960 pgmajfault 66694 pgscan 9534651 pgrefill 6027760 pgsteal 5952333 pginodesteal 0 kswapd_steal 5421086 kswapd_inodesteal 4181 pageoutrun 33500 allocstall 16189 pgrotated 292969 Tue Feb 24 02:38:16 MSK 2004 nr_dirty 0 nr_writeback 1805 nr_unstable 0 nr_page_table_pages 1433 nr_mapped 102046 nr_slab 4782 pgpgin 15956340 pgpgout 19099784 pswpin 30206 pswpout 230912 pgalloc 29315002 pgfree 29316033 pgactivate 1082560 pgdeactivate 1202414 pgfault 25537663 pgmajfault 67369 pgscan 11280124 pgrefill 6802507 pgsteal 6058697 pginodesteal 0 kswapd_steal 5476702 kswapd_inodesteal 4257 pageoutrun 33668 allocstall 17610 pgrotated 391878 4) Full top output: top - 02:39:00 up 8:22, 3 users, load average: 76.16, 25.71, 10.41 Tasks: 225 total, 1 running, 224 sleeping, 0 stopped, 0 zombie Cpu(s): 0.4% us, 5.3% sy, 0.0% ni, 0.2% id, 93.9% wa, 0.2% hi, 0.0% si Mem: 2073868k total, 2071260k used, 2608k free, 2104k buffers Swap: 3583968k total, 1097884k used, 2486084k free, 8604k cached 25123 mysql 15 0 1002m 142m 4896 D 1.0 7.0 10:35.25 mysqld 25122 mysql 15 0 1002m 142m 4896 D 0.0 7.0 0:05.91 mysqld 24132 mysql 15 0 1002m 142m 4896 D 0.0 7.0 0:28.97 mysqld 24129 mysql 15 0 1002m 142m 4896 S 0.0 7.0 0:05.90 mysqld 24125 mysql 15 0 1002m 142m 4896 D 0.1 7.0 0:07.59 mysqld 5420 mysql 16 0 1002m 142m 4896 S 0.0 7.0 0:50.44 mysqld 4748 mysql 15 0 1002m 142m 4896 D 0.0 7.0 3:10.94 mysqld 4746 mysql 15 0 1002m 142m 4896 S 0.0 7.0 2:52.37 mysqld 970 mysql 16 0 1002m 142m 4896 S 0.0 7.0 0:19.57 mysqld 969 mysql 16 0 1002m 142m 4896 S 0.0 7.0 0:17.52 mysqld 968 mysql 15 0 1002m 142m 4896 D 0.1 7.0 0:15.47 mysqld 967 mysql 16 0 1002m 142m 4896 S 0.0 7.0 0:00.00 mysqld 958 mysql 16 0 1002m 142m 4896 S 0.0 7.0 0:15.64 mysqld 957 mysql 15 0 1002m 142m 4896 S 0.0 7.0 2:17.52 mysqld 956 mysql 16 0 1002m 142m 4896 S 0.0 7.0 0:00.16 mysqld 955 mysql 17 0 1002m 142m 
4896 S 0.0 7.0 0:00.00 mysqld 954 mysql 15 0 1002m 142m 4896 S 0.0 7.0 0:00.01 mysqld 898 mysql 15 0 1002m 142m 4896 D 0.0 7.0 0:03.57 mysqld 30132 pricemat 25 0 88976 12m 1944 S 0.0 0.6 2:29.37 make_words 29381 apache 15 0 57948 3200 41m D 0.0 0.2 0:26.14 httpd 29652 apache 15 0 57920 3188 41m S 0.0 0.2 0:19.16 httpd 31015 apache 15 0 56456 2484 41m D 0.0 0.1 0:14.68 httpd 29155 apache 15 0 55064 2916 42m S 0.0 0.1 0:21.33 httpd 30281 apache 15 0 54756 5096 41m D 0.0 0.2 0:11.47 httpd 29638 apache 16 0 54744 3816 42m S 0.0 0.2 0:17.62 httpd 29540 apache 15 0 54436 4732 41m D 0.0 0.2 0:19.75 httpd 30153 apache 15 0 54404 3472 41m S 0.0 0.2 0:12.79 httpd 30123 apache 15 0 54356 4024 41m D 0.0 0.2 0:13.18 httpd 30116 apache 15 0 54316 3352 41m D 0.0 0.2 0:11.75 httpd 29647 apache 15 0 54308 4224 41m D 0.0 0.2 0:17.31 httpd 30134 apache 15 0 53416 2968 41m D 0.0 0.1 0:14.14 httpd 29651 apache 15 0 53040 3220 41m D 0.0 0.2 0:17.58 httpd 29013 apache 15 0 52888 4552 41m S 0.0 0.2 0:13.12 httpd 30619 apache 15 0 52824 3584 41m D 0.0 0.2 0:05.70 httpd 28174 apache 15 0 52692 3956 41m D 0.0 0.2 0:17.85 httpd 30926 apache 15 0 52572 2960 41m S 0.0 0.1 0:04.82 httpd 30117 apache 15 0 52464 4356 41m D 0.1 0.2 0:12.74 httpd 30135 apache 15 0 52392 3984 41m D 0.0 0.2 0:11.73 httpd 30126 apache 15 0 52380 4076 41m D 0.0 0.2 0:13.88 httpd 30133 apache 15 0 52340 2856 41m D 0.0 0.1 0:13.50 httpd 31136 apache 15 0 52316 2596 41m D 0.0 0.1 0:00.90 httpd 30127 apache 15 0 52312 3044 41m D 0.0 0.1 0:12.13 httpd 30136 apache 15 0 52208 2780 41m D 0.0 0.1 0:13.56 httpd 31138 apache 15 0 52116 3272 41m D 0.1 0.2 0:00.78 httpd 31137 apache 15 0 52068 2420 41m D 0.0 0.1 0:00.99 httpd 31289 apache 16 0 51476 1900 41m D 0.0 0.1 0:00.00 httpd 31273 apache 17 0 51360 2188 41m D 0.0 0.1 0:00.01 httpd 31261 apache 16 0 51252 1740 41m D 0.0 0.1 0:00.02 httpd 31234 apache 16 0 51220 1520 41m D 0.1 0.1 0:00.02 httpd 31208 apache 16 0 51220 1888 41m D 0.0 0.1 0:00.02 httpd 31276 apache 16 0 51144 
1920 41m D 0.0 0.1 0:00.00 httpd 31274 apache 16 0 51144 1952 41m D 0.0 0.1 0:00.00 httpd 31258 apache 18 0 51144 2068 41m D 0.0 0.1 0:00.02 httpd 31255 apache 18 0 51144 2068 41m D 0.0 0.1 0:00.01 httpd 31254 apache 16 0 51144 2012 41m D 0.0 0.1 0:00.01 httpd 31252 apache 17 0 51144 1996 41m D 0.0 0.1 0:00.01 httpd 31251 apache 16 0 51144 2012 41m D 0.0 0.1 0:00.01 httpd 31238 apache 16 0 51144 2068 41m D 0.0 0.1 0:00.01 httpd 31212 apache 17 0 51144 2028 41m D 0.0 0.1 0:00.01 httpd 31288 apache 17 0 51140 2056 41m D 0.0 0.1 0:00.01 httpd 31287 apache 16 0 51140 2020 41m D 0.0 0.1 0:00.00 httpd 31227 apache 18 0 51140 2084 41m D 0.0 0.1 0:00.00 httpd 31201 apache 16 0 51140 2024 41m D 0.0 0.1 0:00.01 httpd 31225 apache 16 0 51136 1768 41m D 0.0 0.1 0:00.01 httpd 31300 apache 16 0 51132 1700 41m D 0.0 0.1 0:00.00 httpd 31285 apache 16 0 51132 2112 41m D 0.0 0.1 0:00.00 httpd 31283 apache 16 0 51132 1708 41m D 0.0 0.1 0:00.00 httpd 31280 apache 16 0 51132 1692 41m D 0.0 0.1 0:00.00 httpd 31272 apache 18 0 51132 1828 41m D 0.0 0.1 0:00.00 httpd 31257 apache 16 0 51132 2012 41m D 0.0 0.1 0:00.00 httpd 31207 apache 16 0 51132 1708 41m D 0.0 0.1 0:00.00 httpd 31243 apache 16 0 51128 1856 41m D 0.0 0.1 0:00.00 httpd 31296 apache 16 0 51120 1844 41m D 0.0 0.1 0:00.00 httpd 31295 apache 16 0 51120 1632 41m D 0.0 0.1 0:00.00 httpd 31284 apache 16 0 51120 1640 41m D 0.0 0.1 0:00.01 httpd 31277 apache 16 0 51120 1616 41m D 0.0 0.1 0:00.00 httpd 31271 apache 18 0 51120 1656 41m D 0.0 0.1 0:00.00 httpd 31220 apache 16 0 51120 1620 41m D 0.0 0.1 0:00.01 httpd 31206 apache 16 0 51108 1944 41m D 0.0 0.1 0:00.01 httpd 31249 apache 17 0 51104 1788 41m D 0.0 0.1 0:00.01 httpd 31237 apache 16 0 51104 1848 41m D 0.1 0.1 0:00.02 httpd 31253 apache 16 0 51100 2140 41m D 0.0 0.1 0:00.01 httpd 31203 apache 17 0 51100 1608 41m D 0.0 0.1 0:00.01 httpd 31211 apache 16 0 51096 2004 41m D 0.0 0.1 0:00.01 httpd 31298 apache 17 0 51092 2004 41m D 0.0 0.1 0:00.00 httpd 31282 apache 16 0 51092 2084 
41m D 0.0 0.1 0:00.00 httpd 31267 apache 18 0 51092 2056 41m D 0.0 0.1 0:00.01 httpd 31313 apache 18 0 51088 1512 41m D 0.0 0.1 0:00.00 httpd 31312 apache 16 0 51088 1508 41m D 0.0 0.1 0:00.00 httpd 31310 apache 17 0 51088 1512 41m D 0.0 0.1 0:00.00 httpd 31286 apache 16 0 51088 1680 41m D 0.0 0.1 0:00.01 httpd 31281 apache 15 0 51088 1268 41m D 0.0 0.1 0:00.00 httpd 31269 apache 18 0 51088 1824 41m D 0.0 0.1 0:00.00 httpd 31268 apache 17 0 51088 1776 41m D 0.1 0.1 0:00.04 httpd 31248 apache 16 0 51088 1600 41m S 0.1 0.1 0:00.02 httpd 31242 apache 16 0 51088 1336 41m D 0.0 0.1 0:00.00 httpd 31241 apache 15 0 51088 1636 41m S 0.0 0.1 0:00.01 httpd 31236 apache 18 0 51088 1752 41m D 0.0 0.1 0:00.00 httpd 31233 apache 16 0 51088 1376 41m S 0.0 0.1 0:00.00 httpd 31231 apache 16 0 51088 1196 41m D 0.0 0.1 0:00.00 httpd 31217 apache 16 0 51088 1636 41m S 0.0 0.1 0:00.01 httpd 31214 apache 16 0 51088 1428 41m S 0.0 0.1 0:00.01 httpd 31210 apache 16 0 51088 1320 41m S 0.0 0.1 0:00.00 httpd 31205 apache 18 0 51088 1648 41m D 0.0 0.1 0:00.01 httpd 31204 apache 16 0 51088 1268 41m S 0.0 0.1 0:00.00 httpd 31235 apache 16 0 51080 1364 41m D 0.0 0.1 0:00.00 httpd 31232 apache 16 0 51080 1484 41m D 0.0 0.1 0:00.03 httpd 31219 apache 18 0 51080 1800 41m D 0.0 0.1 0:00.01 httpd 31315 apache 18 0 51076 1384 41m D 0.0 0.1 0:00.00 httpd 31314 apache 16 0 51076 1316 41m D 0.0 0.1 0:00.00 httpd 31311 apache 18 0 51076 1464 41m D 0.0 0.1 0:00.01 httpd 31309 apache 18 0 51076 1384 41m D 0.0 0.1 0:00.00 httpd 31308 apache 18 0 51076 1304 41m D 0.0 0.1 0:00.00 httpd 31306 apache 17 0 51076 1420 41m D 0.0 0.1 0:00.00 httpd 31305 apache 18 0 51076 1320 41m D 0.0 0.1 0:00.01 httpd 31304 apache 18 0 51076 1280 41m D 0.0 0.1 0:00.00 httpd 31303 apache 18 0 51076 1380 41m D 0.0 0.1 0:00.00 httpd 31302 apache 17 0 51076 1308 41m D 0.0 0.1 0:00.00 httpd 31301 apache 15 0 51076 1292 41m D 0.0 0.1 0:00.00 httpd 31297 apache 16 0 51076 1348 41m D 0.0 0.1 0:00.00 httpd 31279 apache 16 0 51076 1292 41m 
S 0.0 0.1 0:00.00 httpd 31278 apache 15 0 51076 1204 41m D 0.0 0.1 0:00.00 httpd 31275 apache 15 0 51076 1196 41m D 0.0 0.1 0:00.00 httpd 31260 apache 16 0 51076 1548 41m S 0.0 0.1 0:00.02 httpd 31259 apache 18 0 51076 1536 41m S 0.0 0.1 0:00.00 httpd 31256 apache 18 0 51076 1444 41m S 0.0 0.1 0:00.00 httpd 31250 apache 16 0 51076 1484 41m S 0.0 0.1 0:00.00 httpd 31247 apache 16 0 51076 1292 41m D 0.0 0.1 0:00.01 httpd 31246 apache 16 0 51076 1296 41m S 0.0 0.1 0:00.01 httpd 31245 apache 18 0 51076 1172 41m D 0.0 0.1 0:00.00 httpd 31244 apache 15 0 51076 1412 41m S 0.0 0.1 0:00.00 httpd 31240 apache 16 0 51076 1500 41m S 0.0 0.1 0:00.01 httpd 31239 apache 15 0 51076 1548 41m D 0.0 0.1 0:00.01 httpd 31230 apache 18 0 51076 1300 41m D 0.0 0.1 0:00.00 httpd 31229 apache 18 0 51076 1304 41m D 0.0 0.1 0:00.00 httpd 31228 apache 16 0 51076 1424 41m S 0.0 0.1 0:00.00 httpd 31226 apache 16 0 51076 1760 41m D 0.0 0.1 0:00.01 httpd 31223 apache 18 0 51076 1216 41m D 0.0 0.1 0:00.00 httpd 31218 apache 18 0 51076 1704 41m D 0.0 0.1 0:00.01 httpd 31216 apache 18 0 51076 1208 41m D 0.0 0.1 0:00.00 httpd 31215 apache 16 0 51076 1240 41m D 0.0 0.1 0:00.00 httpd 31202 apache 17 0 51076 1620 41m D 0.0 0.1 0:00.01 httpd 31325 root 17 0 51064 1320 41m D 0.0 0.1 0:00.00 httpd 31324 root 17 0 51064 1320 41m D 0.0 0.1 0:00.00 httpd 31323 root 15 0 51064 1320 41m D 0.0 0.1 0:00.00 httpd 31322 root 15 0 51064 1320 41m D 0.0 0.1 0:00.00 httpd 31319 root 18 0 51064 1288 41m D 0.0 0.1 0:00.00 httpd 31318 root 17 0 51064 1312 41m D 0.0 0.1 0:00.00 httpd 31316 root 18 0 51064 1328 41m D 0.0 0.1 0:00.00 httpd 794 root 17 0 51064 1192 41m S 0.1 0.1 0:01.67 httpd 23885 pricemat 16 0 5652 1124 4892 S 0.0 0.1 0:00.02 php 23980 pricemat 17 0 5648 652 4892 S 0.0 0.0 0:00.01 php 1430 root 15 0 3780 260 3112 S 0.0 0.0 0:00.61 sshd 8273 root 15 0 3716 468 3112 S 0.0 0.0 0:01.26 sshd 994 root 16 0 3660 516 3112 S 0.0 0.0 0:10.24 sshd 2147 root 15 0 3572 176 3112 S 0.0 0.0 0:00.12 sshd 2129 root 16 0 3572 
156 3112 S 0.0 0.0 0:00.76 sshd 1919 root 15 0 3532 128 3112 S 0.0 0.0 0:00.11 sshd 1480 root 16 0 3488 84 2224 S 0.0 0.0 0:00.83 bash 2991 rathamah 16 0 3336 420 2052 S 0.0 0.0 0:00.04 bash 1431 rathamah 16 0 2828 880 2052 S 0.0 0.0 0:00.04 bash 770 dnscache 15 0 2712 24 1412 S 0.0 0.0 0:17.59 dnscache 728 root 16 0 2672 176 2560 S 0.0 0.0 0:01.72 sshd 1001 rathamah 16 0 2588 48 2052 S 0.0 0.0 0:00.03 bash 2957 root 17 0 2388 40 1984 S 0.0 0.0 0:00.02 login 846 root 20 0 2284 252 2120 S 0.0 0.0 0:00.02 mysqld_safe 750 root 16 0 2212 44 1900 S 0.0 0.0 0:00.00 xinetd 1478 root 16 0 2148 216 1788 S 0.0 0.0 0:00.00 su 1062 rathamah 15 0 2132 508 1728 D 0.4 0.0 2:37.73 top 8278 rathamah 15 0 2028 560 1728 R 0.2 0.0 0:18.48 top 31292 mobilius 18 0 1972 124 1884 S 0.0 0.0 0:00.00 lacheck.sh 31263 mobilius 18 0 1972 112 1884 S 0.0 0.0 0:00.01 sh 2131 rathamah 15 0 1964 80 1884 S 0.0 0.0 0:04.17 proc_vmstat.sh 1192 apache 16 0 1952 44 1816 S 0.0 0.0 0:00.60 cache_clean 708 ntp 16 0 1936 1928 1792 S 0.0 0.1 0:00.10 ntpd 619 root 15 0 1840 304 1624 S 0.0 0.0 0:06.43 syslogd 837 root 15 0 1728 136 1536 S 0.0 0.0 0:00.14 crond 23884 root 16 0 1620 160 1536 S 0.0 0.0 0:00.00 crond 31264 root 18 0 1616 108 1536 S 0.0 0.0 0:00.00 crond 31262 root 17 0 1616 96 1536 S 0.0 0.0 0:00.00 crond 1088 root 23 0 1616 84 1536 S 0.0 0.0 0:00.00 crond 625 root 16 0 1532 188 1364 S 0.0 0.0 0:00.19 klogd 2149 rathamah 15 0 1464 88 1404 S 0.0 0.0 0:00.02 vmstat 863 qmails 15 0 1444 196 1388 S 0.0 0.0 0:45.57 qmail-send 31321 megashop 18 0 1436 200 1388 D 0.0 0.0 0:00.00 qmail-inject 31293 megashop 18 0 1436 200 1388 D 0.0 0.0 0:00.01 qmail-inject 23939 pricemat 16 0 1436 8 1388 S 0.0 0.0 0:00.00 qmail-inject 23 root 17 0 1436 432 1392 S 0.0 0.0 0:00.66 devfsd 31317 rathamah 17 0 1432 124 1420 D 0.0 0.0 0:00.01 date 1201 urs 15 0 1420 100 1392 D 0.0 0.0 0:00.54 tcpserver 1187 urs 18 0 1420 44 1392 S 0.0 0.0 0:00.00 tcpserver 767 root 15 0 1420 76 1356 S 0.0 0.0 0:00.12 svscan 1 root 15 0 1420 424 
1372 D 0.0 0.0 0:04.26 init 866 qmaill 15 0 1412 292 1352 S 0.0 0.0 0:00.62 splogger 867 root 15 0 1404 96 1360 S 0.0 0.0 0:00.15 qmail-lspawn 868 qmailr 16 0 1400 96 1356 S 0.0 0.0 0:00.17 qmail-rspawn 771 dnslog 15 0 1400 16 1368 S 0.0 0.0 0:10.64 multilog 1261 root 16 0 1392 40 1352 S 0.0 0.0 0:00.00 mingetty 919 root 16 0 1392 36 1352 S 0.0 0.0 0:00.00 mingetty 916 root 16 0 1392 92 1352 S 0.0 0.0 0:00.00 mingetty 915 root 16 0 1392 64 1352 S 0.0 0.0 0:00.00 mingetty 914 root 16 0 1392 80 1352 S 0.0 0.0 0:00.00 mingetty 913 root 16 0 1392 76 1352 S 0.0 0.0 0:00.00 mingetty 869 qmailq 15 0 1388 84 1356 S 0.0 0.0 0:00.66 qmail-clean 769 root 16 0 1388 12 1360 S 0.0 0.0 0:00.00 supervise 768 root 16 0 1388 12 1360 S 0.0 0.0 0:00.00 supervise 31307 mobilius 18 0 376 116 348 D 0.0 0.0 0:00.00 awk 31320 root 15 0 0 0 0 D 0.0 0.0 0:00.00 pdflush 31222 root 15 0 0 0 0 D 0.0 0.0 0:00.01 pdflush 24026 root 15 0 0 0 0 D 0.0 0.0 0:05.24 pdflush 18 root 5 -10 0 0 0 S 0.0 0.0 0:00.04 reiserfs/1 17 root 5 -10 0 0 0 S 0.0 0.0 0:00.02 reiserfs/0 16 root 18 0 0 0 0 S 0.0 0.0 0:00.15 kseriod 15 root 15 -10 0 0 0 S 0.0 0.0 0:00.00 aio/1 14 root 10 -10 0 0 0 S 0.0 0.0 0:00.00 aio/0 13 root 15 0 0 0 0 D 8.9 0.0 0:23.43 kswapd0 10 root 15 0 0 0 0 S 0.0 0.0 0:00.00 kirqd 9 root 5 -10 0 0 0 S 0.0 0.0 0:00.01 kblockd/1 8 root 5 -10 0 0 0 S 0.0 0.0 0:00.01 kblockd/0 7 root 5 -10 0 0 0 S 0.0 0.0 0:00.02 events/1 6 root 5 -10 0 0 0 S 0.0 0.0 0:00.03 events/0 5 root 34 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/1 4 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/1 3 root 34 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0 2 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0 > > Also, run > > while true > do > cat /proc/meminfo > sleep 10 > done > > and record the info which that leaves behind when the machine locks up. > This should tell us whether it is an application or kernel memory leak. If > it is indeed a leak. Will do this next time. > > > -- Best regards, Sergey S. 
Kostyliov <rathamahata@php4.ru> Public PGP key: http://sysadminday.org.ru/rathamahata.asc ^ permalink raw reply [flat|nested] 24+ messages in thread
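The /proc/meminfo loop suggested above is only useful if the last snapshots actually reach the disk before the box stops responding. A minimal sketch of the same idea with a timestamp per snapshot and an explicit flush; the function names, the log path argument and the 10 second default are illustrative assumptions, not something taken from the thread:

```python
import os
import time


def snapshot(path="/proc/meminfo"):
    """Return a timestamp line followed by the current contents of `path`."""
    stamp = time.strftime("%a %b %d %H:%M:%S %Z %Y")
    with open(path) as src:
        return stamp + "\n" + src.read()


def log_meminfo(log_path, interval_s=10, source="/proc/meminfo"):
    """Append one snapshot every `interval_s` seconds until interrupted."""
    with open(log_path, "a") as log:
        while True:
            log.write(snapshot(source) + "\n")
            log.flush()
            # Ask the kernel to write the data out now instead of leaving
            # it in the page cache until the next periodic writeback.
            os.fsync(log.fileno())
            time.sleep(interval_s)
```

Calling `log_meminfo("meminfo.log")` runs until it is killed. If the storage itself is wedged, the final `fsync` may never complete, so the last snapshot can still be lost.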
* Re: 2.6.1 IO lockup on SMP systems
2004-02-24 11:54 ` Sergey S. Kostyliov
@ 2004-02-26 12:19 ` Sergey S. Kostyliov
2004-02-26 12:53 ` Andrew Morton
0 siblings, 1 reply; 24+ messages in thread
From: Sergey S. Kostyliov @ 2004-02-26 12:19 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, gluk, anton, Mike Fedyk

On Tuesday 24 February 2004 14:54, Sergey S. Kostyliov wrote:
> On Tuesday 24 February 2004 01:26, Andrew Morton wrote:
> > "Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
> > <cut>
> > > The memory exhaustion is indeed possible for this box. I'll double check
> > > ulimit and /etc/security/limits.conf stuff. The only thing which worries
> > > me that this box had been running for months without any problems with
> > > 2.4.23aa1.
> >
> > It is conceivable that you have some application which runs OK on 2.4.x but
> > has some subtle bug which causes the app to go crazy on a 2.6 kernel
> > consuming lots of memory. Or there's a bug in the 2.6 kernel ;)
> >
> > > I have added another 2Gb to swap space (hope this give enough time
> > > to find the memory hungry process(es)).
> > <cut>
> >
> > OK, so it's doing a lot of swapping and your swap utilisation is
> > continuously increasing. I would suspect an application or kernel memory
> > leak.
> >
> > I suggest you keep that `vmstat 30' running all the time. When the machine
> > dies, take a look at the final 20 lines.
>
> Here is from the last lockup:

Yet another lockup has just occurred. I could be wrong, but from the
/proc/meminfo content it doesn't look like a memory leak (neither kernel
nor userspace), does it?
1) 3 last /proc/meminfo before a hang:
===============================
Thu Feb 26 04:58:34 MSK 2004
MemTotal:      2073868 kB
MemFree:          7008 kB
Buffers:        223100 kB
Cached:         593368 kB
SwapCached:     748824 kB
Active:        1776280 kB
Inactive:       226160 kB
HighTotal:     1179648 kB
HighFree:         2560 kB
LowTotal:       894220 kB
LowFree:          4448 kB
SwapTotal:     3583968 kB
SwapFree:      2675616 kB
Dirty:            2156 kB
Writeback:           0 kB
Mapped:        1219740 kB
Slab:            43668 kB
Committed_AS:  1846968 kB
PageTables:       4020 kB
VmallocTotal:   114680 kB
VmallocUsed:      7448 kB
VmallocChunk:   107232 kB

Thu Feb 26 04:59:05 MSK 2004
MemTotal:      2073868 kB
MemFree:          3972 kB
Buffers:          2268 kB
Cached:          36132 kB
SwapCached:     726940 kB
Active:        1157256 kB
Inactive:         3696 kB
HighTotal:     1179648 kB
HighFree:          704 kB
LowTotal:       894220 kB
LowFree:          3268 kB
SwapTotal:     3583968 kB
SwapFree:      2633444 kB
Dirty:              20 kB
Writeback:        3376 kB
Mapped:        1154812 kB
Slab:            27996 kB
Committed_AS:  1851456 kB
PageTables:       4052 kB
VmallocTotal:   114680 kB
VmallocUsed:      7448 kB
VmallocChunk:   107232 kB

Thu Feb 26 05:00:15 MSK 2004
MemTotal:      2073868 kB
MemFree:          2528 kB
Buffers:          2180 kB
Cached:          34216 kB
SwapCached:     643808 kB
Active:         999316 kB
Inactive:        12088 kB
HighTotal:     1179648 kB
HighFree:          576 kB
LowTotal:       894220 kB
LowFree:          1952 kB
SwapTotal:     3583968 kB
SwapFree:      2559796 kB
Dirty:               0 kB
Writeback:        3052 kB
Mapped:        1001208 kB
Slab:            23932 kB
Committed_AS:  1979784 kB
PageTables:       4840 kB
VmallocTotal:   114680 kB
VmallocUsed:      7448 kB
VmallocChunk:   107232 kB

2) sysrq-M:
===========
SysRq : Show Memory
Mem-info:
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16

Free pages: 2120kB (512kB HighMem)
Active:1067 inactive:93 dirty:0 writeback:0 unstable:0 free:530
DMA free:176kB min:16kB low:32kB high:48kB active:884kB inactive:0kB
Normal free:1432kB min:936kB low:1872kB high:2808kB active:1376kB inactive:372kB
HighMem free:512kB min:512kB low:1024kB high:1536kB active:2008kB inactive:0kB
DMA: 0*4kB 0*8kB 1*16kB 1*32kB 0*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 176kB
Normal: 248*4kB 3*8kB 0*16kB 1*32kB 6*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1432kB
HighMem: 0*4kB 0*8kB 0*16kB 2*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 512kB
Swap cache: add 1726105, delete 1726052, find 1388170/1627421, race 19+488
Free swap: 2195688kB
524288 pages of RAM
294912 pages of HIGHMEM
5821 reserved pages
993 pages shared
54 pages swap cached

3) sysrq-T:
===========
http://sysadminday.org.ru/2.6.3-lockup/20040226/sysrq-T

3) `vmstat 30':
===============
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b    swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0 19 1255096   1952   1996  19920  426 1763   505  1778 1068   172  0  1  0 99
 0 24 1260156   1944   2028  19816  374 1650   463  1670 1067   165  0  1  0 99
 0 18 1266576   1880   2000  18960  372 1835   449  1847 1072   177  0  1  0 99
 0 19 1274696   2904   1960  17892  366 2002   422  2007 1054   179  0  1  0 99
 0 14 1279000   2896   1916  17356  203 1683   243  1693 1037   137  0  1  0 99
 0 19 1288068   2472   1912  16608  180 2074   220  2085 1048   138  0  1  0 99
 1 13 1294388   2152   1932  16404  253 1841   302  1849 1037   117  0  1  0 99
 0 17 1301552   2328   1956  15684  318 1866   375  1880 1037   162  0  1  0 99
 0 18 1307280   2448   1956  15024  331 1697   408  1714 1041   155  0  1  0 99
 0 20 1312696   2184   1852  13948  480 1720   549  1732 1041   166  0  1  0 99
 0 21 1321756   2308   1952  13400  435 2012   572  2028 1048   191  0  1  0 99
 0 20 1330740   2372   1840  12152  509 1920   564  1939 1045   162  0  1  0 99
 0 19 1336432   2616   1844  11252  513 1697   568  1704 1043   135  0  1  0 99
 0 20 1342256   2364   1896  10704  520 1810   573  1816 1042   185  0  1  0 99
 0 17 1350608   2868   1796  10112  368 2079   412  2092 1040   133  0  1  0 99
 0 19 1356100   2176   1988   9120  401 1668   533  1677 1039   161  0  1  0 99
 0 20 1359692   2248   2004   8876  369 1500   482  1514 1039   169  0  1  0 99
 0 19 1364868   2696   1904   8428  455 1643   604  1658 1038   172  0  1  0 99
 0 20 1371124   2876   1920   7212  537 2133   745  2147 1312   209  0  1  0 99
 0 20 1378172   3192   1832   6036  614 1623   793  1631 1042   180  0  1  0 99

--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc

^ permalink raw reply [flat|nested] 24+ messages in thread
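The trend in the `swpd` column of the vmstat output above can be put into numbers. A rough sketch, assuming the 20 samples really are 30 seconds apart as the `vmstat 30` invocation implies; `swap_growth` is a hypothetical helper written for this illustration:

```python
# swpd column (kB) from the 20 vmstat samples posted above, oldest first.
SWPD_KB = [
    1255096, 1260156, 1266576, 1274696, 1279000,
    1288068, 1294388, 1301552, 1307280, 1312696,
    1321756, 1330740, 1336432, 1342256, 1350608,
    1356100, 1359692, 1364868, 1371124, 1378172,
]
INTERVAL_S = 30  # assumed sampling interval, from `vmstat 30`


def swap_growth(samples, interval_s):
    """Return (total growth in kB, average growth in kB per second)."""
    delta_kb = samples[-1] - samples[0]
    elapsed_s = (len(samples) - 1) * interval_s
    return delta_kb, delta_kb / elapsed_s


delta_kb, rate = swap_growth(SWPD_KB, INTERVAL_S)
print(f"swpd grew by {delta_kb} kB, about {rate:.0f} kB/s on average")
```

Swap usage rises in every single sample, by roughly 120 MB over nine and a half minutes, while free memory stays pinned at a few megabytes.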
* Re: 2.6.1 IO lockup on SMP systems
2004-02-26 12:19 ` Sergey S. Kostyliov
@ 2004-02-26 12:53 ` Andrew Morton
2004-02-26 13:11 ` Andrew Morton
2004-02-26 14:30 ` Sergey S. Kostyliov
0 siblings, 2 replies; 24+ messages in thread
From: Andrew Morton @ 2004-02-26 12:53 UTC (permalink / raw)
To: Sergey S. Kostyliov; +Cc: linux-kernel, gluk, anton, mfedyk

"Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
>
> Yet another lockup has just occurred. I could be wrong but from the
> /proc/meminfo content it doesn't looks like memory leak (neither kernel
> nor userspace), doesn't it?

I think it's a kernel leak.

> Thu Feb 26 05:00:15 MSK 2004
> MemTotal:      2073868 kB
> MemFree:          2528 kB
> Buffers:          2180 kB
> Cached:          34216 kB
> SwapCached:     643808 kB
> Active:         999316 kB
> Inactive:        12088 kB
> HighTotal:     1179648 kB
> HighFree:          576 kB
> LowTotal:       894220 kB
> LowFree:          1952 kB
> SwapTotal:     3583968 kB
> SwapFree:      2559796 kB
> Dirty:               0 kB
> Writeback:        3052 kB
> Mapped:        1001208 kB
> Slab:            23932 kB
> Committed_AS:  1979784 kB
> PageTables:       4840 kB
> VmallocTotal:   114680 kB
> VmallocUsed:      7448 kB
> VmallocChunk:   107232 kB

A gig of mapped memory, most of it in swapcache. That's probably all
highmem. Only a gig of memory on the page LRU. Where is the rest? Lost.

Almost no pagecache at all, slab is small.

> 3) sysrq-T:
> ===========
> http://sysadminday.org.ru/2.6.3-lockup/20040226/sysrq-T

hm, you have 34 instances of crond running. How odd.

> 3) `vmstat 30':
> ===============
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>  r  b    swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
>  0 19 1255096   1952   1996  19920  426 1763   505  1778 1068   172  0  1  0 99
>  0 24 1260156   1944   2028  19816  374 1650   463  1670 1067   165  0  1  0 99

Again, all your memory has vanished.

I'd say that we've leaked everything in lowmem and everyone is stuck trying
to reclaim some lowmem memory. Not sure why the oom-killer didn't do
anything. I haven't tested it in a year - maybe it broke.

So. What are you using which is different from everyone else? DAC960 I
see. What about firewall setups, NIC drivers, RAID/MD/etc? Anything in
there which isn't a mainstream thing?

^ permalink raw reply [flat|nested] 24+ messages in thread
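The "Where is the rest?" arithmetic can be redone from the 05:00:15 snapshot. A minimal sketch, assuming that MemFree, the LRU lists (Active plus Inactive), Slab and PageTables are the consumers worth subtracting from MemTotal; that selection is an illustration only, not the kernel's own accounting, and both helper names are hypothetical:

```python
# Fields copied from the 05:00:15 /proc/meminfo snapshot in this thread (kB).
SNAPSHOT = """\
MemTotal: 2073868 kB
MemFree: 2528 kB
Active: 999316 kB
Inactive: 12088 kB
Slab: 23932 kB
PageTables: 4840 kB
"""

# Assumed set of visible consumers; anything else counts as unaccounted.
ACCOUNTED = ("MemFree", "Active", "Inactive", "Slab", "PageTables")


def parse_meminfo(text):
    """Turn 'Key: value kB' lines into a {key: value_in_kB} dict."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            fields[key.strip()] = int(rest.split()[0])
    return fields


def unaccounted_kb(fields):
    """MemTotal minus every consumer listed in ACCOUNTED."""
    return fields["MemTotal"] - sum(fields[name] for name in ACCOUNTED)


fields = parse_meminfo(SNAPSHOT)
print(f"on LRU: {fields['Active'] + fields['Inactive']} kB")
print(f"unaccounted: {unaccounted_kb(fields)} kB")
```

Under these assumptions about half of the 2 GB is on the LRU and roughly another gigabyte does not show up in any of the listed counters, which matches the "Lost" remark above.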
* Re: 2.6.1 IO lockup on SMP systems
2004-02-26 12:53 ` Andrew Morton
@ 2004-02-26 13:11 ` Andrew Morton
2004-02-26 14:37 ` Dave Jones
2004-02-26 14:30 ` Sergey S. Kostyliov
1 sibling, 1 reply; 24+ messages in thread
From: Andrew Morton @ 2004-02-26 13:11 UTC (permalink / raw)
To: rathamahata, linux-kernel, gluk, anton, mfedyk

Andrew Morton <akpm@osdl.org> wrote:
>
> Not sure why the oom-killer didn't do anything.

There's still free swap space. The oom-killer has problems.

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 2.6.1 IO lockup on SMP systems
2004-02-26 13:11 ` Andrew Morton
@ 2004-02-26 14:37 ` Dave Jones
2004-02-26 15:37 ` Arjan van de Ven
0 siblings, 1 reply; 24+ messages in thread
From: Dave Jones @ 2004-02-26 14:37 UTC (permalink / raw)
To: Andrew Morton; +Cc: rathamahata, linux-kernel, gluk, anton, mfedyk

On Thu, Feb 26, 2004 at 05:11:35AM -0800, Andrew Morton wrote:
> Andrew Morton <akpm@osdl.org> wrote:
> >
> > Not sure why the oom-killer didn't do anything.
>
> There's still free swap space. The oom-killer has problems.

That sounds odd. Surely if we have free swap, we don't
want the oom-killer to do anything ?

Dave

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 2.6.1 IO lockup on SMP systems
2004-02-26 14:37 ` Dave Jones
@ 2004-02-26 15:37 ` Arjan van de Ven
0 siblings, 0 replies; 24+ messages in thread
From: Arjan van de Ven @ 2004-02-26 15:37 UTC (permalink / raw)
To: Dave Jones; +Cc: Andrew Morton, rathamahata, linux-kernel, gluk, anton, mfedyk

On Thu, 2004-02-26 at 15:37, Dave Jones wrote:
> On Thu, Feb 26, 2004 at 05:11:35AM -0800, Andrew Morton wrote:
> > Andrew Morton <akpm@osdl.org> wrote:
> > >
> > > Not sure why the oom-killer didn't do anything.
> >
> > There's still free swap space. The oom-killer has problems.
>
> That sounds odd. Surely if we have free swap, we don't
> want the oom-killer to do anything ?

with highmem it's not so easy :)
the lowzone can be entirely pinned by pagetables and such and the
highmem zone can be free... and still you want to oomkill.

^ permalink raw reply [flat|nested] 24+ messages in thread
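The point about highmem is that memory pressure has to be judged per zone rather than from a global total. A small sketch that applies this to the per-zone lines of the first sysrq-M dump posted earlier in the thread; the rule used here (flag any zone whose free memory is below its `low` watermark) is a simplification chosen for illustration and is not the oom-killer's actual logic, and `zones_below_low` is a hypothetical helper:

```python
import re

# Per-zone summary lines copied from the 2004-02-26 sysrq-M output.
ZONE_LINES = """\
DMA free:176kB min:16kB low:32kB high:48kB active:884kB inactive:0kB
Normal free:1432kB min:936kB low:1872kB high:2808kB active:1376kB inactive:372kB
HighMem free:512kB min:512kB low:1024kB high:1536kB active:2008kB inactive:0kB
"""

# Captures: zone name, free kB, min kB, low kB.
ZONE_RE = re.compile(r"^(\w+) free:(\d+)kB min:(\d+)kB low:(\d+)kB")


def zones_below_low(text):
    """Return the names of zones whose free memory is under the low mark."""
    flagged = []
    for line in text.splitlines():
        match = ZONE_RE.match(line)
        if match is None:
            continue
        name, free_kb, _min_kb, low_kb = match.groups()
        if int(free_kb) < int(low_kb):
            flagged.append(name)
    return flagged


print(zones_below_low(ZONE_LINES))
```

In this particular dump both Normal and HighMem fall under their low watermark while DMA does not, so a check on any single global number would hide which zone is actually starved.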
* Re: 2.6.1 IO lockup on SMP systems 2004-02-26 12:53 ` Andrew Morton 2004-02-26 13:11 ` Andrew Morton @ 2004-02-26 14:30 ` Sergey S. Kostyliov 2004-02-26 20:03 ` Andrew Morton 1 sibling, 1 reply; 24+ messages in thread From: Sergey S. Kostyliov @ 2004-02-26 14:30 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel, gluk, anton, mfedyk On Thursday 26 February 2004 15:53, Andrew Morton wrote: > "Sergey S. Kostyliov" <rathamahata@php4.ru> wrote: > > > > Yet another lockup has just occurred. I could be wrong but from the > > /proc/meminfo content it doesn't looks like memory leak (neither kernel > > nor userspace), doesn't it? > > I think it's a kernel leak. > > > Thu Feb 26 05:00:15 MSK 2004 > > MemTotal: 2073868 kB > > MemFree: 2528 kB > > Buffers: 2180 kB > > Cached: 34216 kB > > SwapCached: 643808 kB > > Active: 999316 kB > > Inactive: 12088 kB > > HighTotal: 1179648 kB > > HighFree: 576 kB > > LowTotal: 894220 kB > > LowFree: 1952 kB > > SwapTotal: 3583968 kB > > SwapFree: 2559796 kB > > Dirty: 0 kB > > Writeback: 3052 kB > > Mapped: 1001208 kB > > Slab: 23932 kB > > Committed_AS: 1979784 kB > > PageTables: 4840 kB > > VmallocTotal: 114680 kB > > VmallocUsed: 7448 kB > > VmallocChunk: 107232 kB > > A gig of mapped memory, most of it in swapcache. That's probably all > highmem. Only a gig of memory on the page LRU. Where is the rest? Lost. > > Almost no pagecache at all, slab is small. > > > 3) sysrq-T: > > =========== > > http://sysadminday.org.ru/2.6.3-lockup/20040226/sysrq-T > > hm, you have 34 instances of crond running. How odd. > > > 3) `vmstat 30': > > =============== > > procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- > > r b swpd free buff cache si so bi bo in cs us sy id wa > > 0 19 1255096 1952 1996 19920 426 1763 505 1778 1068 172 0 1 0 99 > > 0 24 1260156 1944 2028 19816 374 1650 463 1670 1067 165 0 1 0 99 > > Again, all your memory has vanished. 
> > I'd say that we've leaked everything in lowmem and everyone is stuck trying > to reclaim some lowmem memory. Not sure why the oom-killer didn't do > anything. I haven't tested it in a year - maybe it broke. > > So. What are you using which is different from everyone else? DAC960 I > see. What about firewall setups, NIC drivers, RAID/MD/etc? Anything in > there which isn't a mainstream thing? Iptables (ipt_REJECT,ipt_state,ip_conntrack,ipt_state,iptable_filter modules) is used as firewall. I think NICs are pretty usual: 00:04.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08) 00:05.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08) handled by Intel e100 driver. Only plain partitions (there is no md, dm or something like this): [rathamahata@ope rathamahata]$ mount /dev/rd/host0/target0/part1 on / type reiserfs (rw) none on /proc type proc (rw) none on /dev/pts type devpts (rw,gid=5,mode=620) /dev/rd/host0/target1/part2 on /usr/local type reiserfs (rw) /dev/rd/host0/target3/part1 on /var type reiserfs (rw,noatime,nodiratime) /dev/rd/host0/target7/part1 on /var/www/html/fo type reiserfs (rw,noatime,nodiratime) /dev/rd/host0/target2/part1 on /home type reiserfs (rw,noatime,nodiratime) /dev/rd/host0/target4/part1 on /var/lib/innodb/1 type reiserfs (rw,noatime,nodiratime,notail) /dev/rd/host0/target5/part1 on /var/lib/innodb/2 type reiserfs (rw,noatime,nodiratime,notail) /dev/rd/host0/target6/part1 on /var/lib/oracle/db04 type reiserfs (rw,noatime,nodiratime,notail) sysfs on /sys type sysfs (rw) Here is a .config: CONFIG_X86=y CONFIG_MMU=y CONFIG_UID16=y CONFIG_GENERIC_ISA_DMA=y CONFIG_EXPERIMENTAL=y CONFIG_CLEAN_COMPILE=y CONFIG_STANDALONE=y CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_SYSCTL=y CONFIG_LOG_BUF_SHIFT=15 CONFIG_KALLSYMS=y CONFIG_FUTEX=y CONFIG_EPOLL=y CONFIG_IOSCHED_NOOP=y CONFIG_IOSCHED_AS=y CONFIG_IOSCHED_DEADLINE=y CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y CONFIG_OBSOLETE_MODPARM=y CONFIG_KMOD=y CONFIG_X86_PC=y 
CONFIG_MPENTIUMIII=y CONFIG_X86_CMPXCHG=y CONFIG_X86_XADD=y CONFIG_X86_L1_CACHE_SHIFT=5 CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_X86_GOOD_APIC=y CONFIG_X86_INTEL_USERCOPY=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_SMP=y CONFIG_NR_CPUS=2 CONFIG_PREEMPT=y CONFIG_X86_LOCAL_APIC=y CONFIG_X86_IO_APIC=y CONFIG_X86_TSC=y CONFIG_MICROCODE=m CONFIG_X86_MSR=m CONFIG_X86_CPUID=m CONFIG_HIGHMEM4G=y CONFIG_HIGHMEM=y CONFIG_MTRR=y CONFIG_HAVE_DEC_LOCK=y CONFIG_PM=y CONFIG_ACPI_BOOT=y CONFIG_APM=y CONFIG_PCI=y CONFIG_PCI_GOANY=y CONFIG_PCI_BIOS=y CONFIG_PCI_DIRECT=y CONFIG_PCI_NAMES=y CONFIG_BINFMT_ELF=y CONFIG_BLK_DEV_DAC960=y CONFIG_BLK_DEV_LOOP=m CONFIG_NET=y CONFIG_PACKET=y CONFIG_UNIX=y CONFIG_INET=y CONFIG_IP_MULTICAST=y CONFIG_NETFILTER=y CONFIG_IP_NF_CONNTRACK=m CONFIG_IP_NF_FTP=m CONFIG_IP_NF_IPTABLES=y CONFIG_IP_NF_MATCH_LIMIT=m CONFIG_IP_NF_MATCH_IPRANGE=m CONFIG_IP_NF_MATCH_MAC=m CONFIG_IP_NF_MATCH_PKTTYPE=m CONFIG_IP_NF_MATCH_MARK=m CONFIG_IP_NF_MATCH_MULTIPORT=m CONFIG_IP_NF_MATCH_TOS=m CONFIG_IP_NF_MATCH_RECENT=m CONFIG_IP_NF_MATCH_ECN=m CONFIG_IP_NF_MATCH_DSCP=m CONFIG_IP_NF_MATCH_LENGTH=m CONFIG_IP_NF_MATCH_TTL=m CONFIG_IP_NF_MATCH_TCPMSS=m CONFIG_IP_NF_MATCH_HELPER=m CONFIG_IP_NF_MATCH_STATE=m CONFIG_IP_NF_MATCH_CONNTRACK=m CONFIG_IP_NF_MATCH_OWNER=m CONFIG_IP_NF_FILTER=m CONFIG_IP_NF_TARGET_REJECT=m CONFIG_IP_NF_MANGLE=m CONFIG_IP_NF_TARGET_TOS=m CONFIG_IP_NF_TARGET_ECN=m CONFIG_IP_NF_TARGET_MARK=m CONFIG_IP_NF_TARGET_CLASSIFY=m CONFIG_IP_NF_TARGET_LOG=m CONFIG_IP_NF_TARGET_ULOG=m CONFIG_IP_NF_TARGET_TCPMSS=m CONFIG_IPV6_SCTP__=y CONFIG_NETDEVICES=y CONFIG_DUMMY=m CONFIG_NET_ETHERNET=y CONFIG_MII=y CONFIG_NET_PCI=y CONFIG_E100=y CONFIG_INPUT=y CONFIG_INPUT_MOUSEDEV=y CONFIG_INPUT_MOUSEDEV_PSAUX=y CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024 CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768 CONFIG_SOUND_GAMEPORT=y CONFIG_SERIO=y CONFIG_SERIO_I8042=y CONFIG_INPUT_KEYBOARD=y CONFIG_KEYBOARD_ATKBD=y 
CONFIG_INPUT_MOUSE=y CONFIG_MOUSE_PS2=y CONFIG_VT=y CONFIG_VT_CONSOLE=y CONFIG_HW_CONSOLE=y CONFIG_SERIAL_8250=y CONFIG_SERIAL_8250_CONSOLE=y CONFIG_SERIAL_8250_NR_UARTS=4 CONFIG_SERIAL_CORE=y CONFIG_SERIAL_CORE_CONSOLE=y CONFIG_UNIX98_PTYS=y CONFIG_UNIX98_PTY_COUNT=256 CONFIG_VIDEO_SELECT=y CONFIG_VGA_CONSOLE=y CONFIG_DUMMY_CONSOLE=y CONFIG_EXT2_FS=m CONFIG_EXT3_FS=m CONFIG_EXT3_FS_XATTR=y CONFIG_JBD=m CONFIG_FS_MBCACHE=m CONFIG_REISERFS_FS=y CONFIG_PROC_FS=y CONFIG_PROC_KCORE=y CONFIG_DEVFS_FS=y CONFIG_DEVPTS_FS=y CONFIG_TMPFS=y CONFIG_RAMFS=y CONFIG_MSDOS_PARTITION=y CONFIG_NLS=y CONFIG_NLS_DEFAULT="iso8859-1" CONFIG_NLS_CODEPAGE_437=y CONFIG_NLS_ISO8859_1=y CONFIG_DEBUG_KERNEL=y CONFIG_MAGIC_SYSRQ=y CONFIG_X86_FIND_SMP_CONFIG=y CONFIG_X86_MPPARSE=y CONFIG_X86_SMP=y CONFIG_X86_HT=y CONFIG_X86_BIOS_REBOOT=y CONFIG_X86_TRAMPOLINE=y CONFIG_PC=y > > > -- Best regards, Sergey S. Kostyliov <rathamahata@php4.ru> Public PGP key: http://sysadminday.org.ru/rathamahata.asc ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 2.6.1 IO lockup on SMP systems
2004-02-26 14:30 ` Sergey S. Kostyliov
@ 2004-02-26 20:03 ` Andrew Morton
2004-02-28 14:56 ` Sergey S. Kostyliov
0 siblings, 1 reply; 24+ messages in thread
From: Andrew Morton @ 2004-02-26 20:03 UTC (permalink / raw)
To: Sergey S. Kostyliov; +Cc: linux-kernel, gluk, anton, mfedyk

"Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
>
> > So. What are you using which is different from everyone else? DAC960 I
> > see. What about firewall setups, NIC drivers, RAID/MD/etc? Anything in
> > there which isn't a mainstream thing?
>
> Iptables (ipt_REJECT,ipt_state,ip_conntrack,ipt_state,iptable_filter modules)
> is used as firewall.
>
> I think NICs are pretty usual:
> 00:04.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
> 00:05.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
> handled by Intel e100 driver.
>
> Only plain partitions (there is no md, dm or something like this):
> [rathamahata@ope rathamahata]$ mount
> /dev/rd/host0/target0/part1 on / type reiserfs (rw)
> none on /proc type proc (rw)
> none on /dev/pts type devpts (rw,gid=5,mode=620)
> /dev/rd/host0/target1/part2 on /usr/local type reiserfs (rw)
> /dev/rd/host0/target3/part1 on /var type reiserfs (rw,noatime,nodiratime)
> /dev/rd/host0/target7/part1 on /var/www/html/fo type reiserfs (rw,noatime,nodiratime)
> /dev/rd/host0/target2/part1 on /home type reiserfs (rw,noatime,nodiratime)
> /dev/rd/host0/target4/part1 on /var/lib/innodb/1 type reiserfs (rw,noatime,nodiratime,notail)
> /dev/rd/host0/target5/part1 on /var/lib/innodb/2 type reiserfs (rw,noatime,nodiratime,notail)
> /dev/rd/host0/target6/part1 on /var/lib/oracle/db04 type reiserfs (rw,noatime,nodiratime,notail)
> sysfs on /sys type sysfs (rw)

OK, thanks. Is there any possibility that you can run without iptables for
a while, see if that fixes it?

^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 2.6.1 IO lockup on SMP systems
2004-02-26 20:03 ` Andrew Morton
@ 2004-02-28 14:56 ` Sergey S. Kostyliov
2004-04-08 9:08 ` 2.6.X kernel memory leak? (was: Re: 2.6.1 IO lockup on SMP systems) Sergey S. Kostyliov
0 siblings, 1 reply; 24+ messages in thread
From: Sergey S. Kostyliov @ 2004-02-28 14:56 UTC
To: Andrew Morton; +Cc: linux-kernel, gluk, anton, mfedyk

On Thursday 26 February 2004 23:03, Andrew Morton wrote:
<cut>
> OK, thanks. Is there any possibility that you can run without iptables for
> a while, see if that fixes it?

I recompiled 2.6.3 without iptables support. Unfortunately that doesn't
solve the problem: the machine still hangs.

1) sysrq-M:

SysRq : Show Memory
Mem-info:
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
cpu 1 hot: low 2, high 6, batch 1
cpu 1 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
cpu 1 hot: low 32, high 96, batch 16
cpu 1 cold: low 0, high 32, batch 16

Free pages: 3276kB (512kB HighMem)
Active:820 inactive:195 dirty:0 writeback:0 unstable:0 free:819
DMA free:1348kB min:16kB low:32kB high:48kB active:316kB inactive:0kB
Normal free:1416kB min:936kB low:1872kB high:2808kB active:1388kB inactive:348kB
HighMem free:512kB min:512kB low:1024kB high:1536kB active:1604kB inactive:404kB
DMA: 75*4kB 69*8kB 21*16kB 3*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1348kB
Normal: 98*4kB 20*8kB 0*16kB 1*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 1416kB
HighMem: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 2*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 512kB
Swap cache: add 342862, delete 342774, find 15349/23980, race 14+29
Free swap: 2473044kB
524288 pages of RAM
294912 pages of HIGHMEM
5814 reserved pages
899 pages shared
89 pages swap cached

2) /proc/meminfo before a lockup

Sat Feb 28 06:42:33 MSK 2004
MemTotal: 2073896 kB
MemFree: 3452 kB
Buffers: 2240 kB
Cached: 29648 kB
SwapCached: 21084 kB
Active: 627896 kB
Inactive: 17340 kB
HighTotal: 1179648 kB
HighFree: 576 kB
LowTotal: 894248 kB
LowFree: 2876 kB
SwapTotal: 3583968 kB
SwapFree: 3095996 kB
Dirty: 0 kB
Writeback: 14104 kB
Mapped: 625540 kB
Slab: 19044 kB
Committed_AS: 1767368 kB
PageTables: 4316 kB
VmallocTotal: 114680 kB
VmallocUsed: 7448 kB
VmallocChunk: 107232 kB

--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc
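A rough way to read the /proc/meminfo snapshot above is to subtract the pools that are normally visible from MemTotal and see how much is left over. The sketch below does that arithmetic in Python. It assumes that Active and Inactive together cover the LRU pages (page cache, swap cache and anonymous memory) and that they do not overlap with MemFree, Slab or PageTables; Buffers, Cached and SwapCached are left out because they are expected to sit on the LRU lists already. The function names are made up for illustration, and a large remainder is only a hint that memory is held somewhere these counters do not show, not proof of a leak.

```python
import re

# Values copied from the /proc/meminfo snapshot in the message above (kB).
MEMINFO = """
MemTotal: 2073896 kB
MemFree: 3452 kB
Buffers: 2240 kB
Cached: 29648 kB
SwapCached: 21084 kB
Active: 627896 kB
Inactive: 17340 kB
Slab: 19044 kB
PageTables: 4316 kB
"""


def parse_meminfo(text):
    """Return {field: kB} for every 'Name: <n> kB' pair found in text."""
    return {name: int(kb) for name, kb in re.findall(r"(\w+):\s+(\d+) kB", text)}


def unaccounted_kb(info):
    """MemTotal minus the pools assumed (see lead-in) not to overlap."""
    known = ("MemFree", "Active", "Inactive", "Slab", "PageTables")
    return info["MemTotal"] - sum(info[name] for name in known)


info = parse_meminfo(MEMINFO)
print(unaccounted_kb(info), "kB not covered by free/LRU/slab/page tables")
```

With the numbers from this report the remainder is about 1.4 GB out of 2 GB, which is consistent with the suspicion that something other than user processes and the page cache is holding memory.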
* 2.6.X kernel memory leak? (was: Re: 2.6.1 IO lockup on SMP systems)
2004-02-28 14:56 ` Sergey S. Kostyliov
@ 2004-04-08 9:08 ` Sergey S. Kostyliov
2004-04-09 7:17 ` 2.6.X kernel memory leak? Sergey S. Kostyliov
0 siblings, 1 reply; 24+ messages in thread
From: Sergey S. Kostyliov @ 2004-04-08 9:08 UTC
To: linux-kernel; +Cc: Anton Kovalenko

Hello all,

On Saturday 28 February 2004 17:56, Sergey S. Kostyliov wrote:
> On Thursday 26 February 2004 23:03, Andrew Morton wrote:
> <cut>
> > OK, thanks. Is there any possibility that you can run without iptables for
> > a while, see if that fixes it?
>
> I recompiled 2.6.3 without iptables support. Unfortunately that doesn't
> solve the problem: the machine still hangs.

It looks like the problem hasn't gone away in the latest kernels. The visible
symptoms haven't changed: the machine is pingable, and TCP ports which were in
LISTEN state remain in LISTEN state after the lockup, but nothing else responds.

The latest lockup happened on a different machine than in my previous reports,
so I suspect this is not a hardware issue. The kernel is 2.6.5-aa3, but
I believe Andrea's changes are not related to this problem.

sysrq-M
http://sysadminday.org.ru/2.6.X-lockup/terror/20040408/sysrq-M

sysrq-T
http://sysadminday.org.ru/2.6.X-lockup/terror/20040408/sysrq-T

.config
http://sysadminday.org.ru/2.6.X-lockup/terror/.config

`lspci -vv'
http://sysadminday.org.ru/2.6.X-lockup/terror/lspci_-vv

`dmesg'
http://sysadminday.org.ru/2.6.X-lockup/terror/dmesg

/etc/fstab
http://sysadminday.org.ru/2.6.X-lockup/terror/fstab

--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc
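The symptom described above (ports stay in LISTEN, yet no data ever comes back) can be checked from a remote host with a small probe. The sketch below is only an illustration and not something used in this thread: the function name `probe` and its return labels are invented here. It relies on the fact that the kernel completes the TCP handshake for a listening socket on its own, so a connect can succeed even when the user-space daemon is stuck and never calls accept(). It is only meaningful for services that send a banner first (SSH, FTP, SMTP); for HTTP a request would have to be sent before reading.

```python
import socket


def probe(host, port, timeout=2.0):
    """Classify a TCP port as 'refused', 'unreachable', 'silent',
    'reset', 'closed' or 'responsive'.

    'silent' means the handshake completed but no byte arrived within
    the timeout, which matches the lockup symptom in the report.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"          # nothing is listening on the port
    except OSError:
        return "unreachable"      # timeout, no route, and similar errors
    with sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(1)   # wait for the first banner byte
        except socket.timeout:
            return "silent"
        except OSError:
            return "reset"
        return "responsive" if data else "closed"


if __name__ == "__main__":
    # Hypothetical usage: replace host and port with a real banner service.
    print(probe("127.0.0.1", 22, timeout=1.0))
```

A healthy sshd or proftpd would normally come back as "responsive", while a box in the state described here would be expected to report "silent" on the same ports.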
* Re: 2.6.X kernel memory leak?
2004-04-08 9:08 ` 2.6.X kernel memory leak? (was: Re: 2.6.1 IO lockup on SMP systems) Sergey S. Kostyliov
@ 2004-04-09 7:17 ` Sergey S. Kostyliov
2004-04-09 9:09 ` Andrew Morton
0 siblings, 1 reply; 24+ messages in thread
From: Sergey S. Kostyliov @ 2004-04-09 7:17 UTC
To: linux-kernel; +Cc: Anton Kovalenko

On Thursday 08 April 2004 13:08, Sergey S. Kostyliov wrote:
> Hello all,
>
> On Saturday 28 February 2004 17:56, Sergey S. Kostyliov wrote:
> > On Thursday 26 February 2004 23:03, Andrew Morton wrote:
> > <cut>
> > > OK, thanks. Is there any possibility that you can run without iptables for
> > > a while, see if that fixes it?
> >
> > I recompiled 2.6.3 without iptables support. Unfortunately that doesn't
> > solve the problem: the machine still hangs.
>
> It looks like the problem hasn't gone away in the latest kernels. The visible
> symptoms haven't changed: the machine is pingable, and TCP ports which were in
> LISTEN state remain in LISTEN state after the lockup, but nothing else responds.
>
> The latest lockup happened on a different machine than in my previous reports,
> so I suspect this is not a hardware issue. The kernel is 2.6.5-aa3, but
> I believe Andrea's changes are not related to this problem.
>
> sysrq-M
> http://sysadminday.org.ru/2.6.X-lockup/terror/20040408/sysrq-M
>
> sysrq-T
> http://sysadminday.org.ru/2.6.X-lockup/terror/20040408/sysrq-T
>
> .config
> http://sysadminday.org.ru/2.6.X-lockup/terror/.config
>
> `lspci -vv'
> http://sysadminday.org.ru/2.6.X-lockup/terror/lspci_-vv
>
> `dmesg'
> http://sysadminday.org.ru/2.6.X-lockup/terror/dmesg
>
> /etc/fstab
> http://sysadminday.org.ru/2.6.X-lockup/terror/fstab
>

And here is part of the sysrq-T output for the third machine, which has just
locked up. The kernel is 2.6.5-rc3-aa2.

multilog S F7BF3D60 0 3302 3288 (NOTLB) f7b83ed8 00000082 00000001 f7bf3d60 f7b83e9c c011a771 f7a4db80 00000000 00000003 f7bf3d58 f7b82000 00000282 f7aaece0 00000000 0804ea70 f7aaece0 f7aaed00 c180dbe0 0000111c 19e0b9c6 0001faed f7a89a70 f7b83f00 f7a6bb80 Call Trace: [<c011a771>] __wake_up_common+0x31/0x60 [<c016ee7c>] pipe_wait+0x7c/0xa0 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c016f07a>] pipe_readv+0x1da/0x2c0 [<c016f180>] pipe_read+0x20/0x30 [<c0160cc0>] vfs_read+0xb0/0x110 [<c0160f18>] sys_read+0x38/0x60 [<c01073c9>] sysenter_past_esp+0x52/0x71 qmail-lspawn S C030D340 0 3325 3301 3326 (NOTLB) f74c5ea4 00000082 c0117444 c030d340 00000246 01470f60 f7cd8b80 c030d6d0 00000000 c030d6c0 c1382d20 00000000 00000000 19c98941 0001faed f7aaece0 f7aaed00 c1815be0 00004ec0 19ca1051 0001faed f7bb3a10 00000010 f74c5eb4 Call Trace: [<c0117444>] do_page_fault+0x304/0x4ef [<c0129693>] schedule_timeout+0xc3/0xd0 [<c0175a90>] __pollwait+0x80/0xd0 [<c016f5d2>] pipe_poll+0x32/0x90 [<c0175da2>] do_select+0x1c2/0x330 [<c0175a10>] __pollwait+0x0/0xd0 [<c017620e>] sys_select+0x2de/0x4d0 [<c016030f>] filp_close+0x4f/0x80 [<c01073c9>] sysenter_past_esp+0x52/0x71 qmail-rspawn S C030D300 0 3326 3301 3327 3325 (NOTLB) f74d9ea4 00000082 f74d8000 c030d300 00000246 01468f60 f7a9eb80 c181756c f74d9e58 c030d680 c11654c0 00000000 00000000 c0118397 00000000 f7aaece0 f7aaed00 c180dbe0 0000f336 ad9386e9 000010b2 f747bad0 cbf9ff0c f74d9eb4 Call Trace: [<c0118397>] recalc_task_prio+0x97/0x1c0 [<c0129693>] schedule_timeout+0xc3/0xd0 [<c0175a90>] __pollwait+0x80/0xd0 [<c016f5d2>] pipe_poll+0x32/0x90 [<c0175da2>] do_select+0x1c2/0x330 [<c0175a10>] __pollwait+0x0/0xd0 [<c017620e>] sys_select+0x2de/0x4d0 [<c016030f>] filp_close+0x4f/0x80 [<c01073c9>] sysenter_past_esp+0x52/0x71 qmail-clean S 00000012 0 3327 3301 3326 (NOTLB) f7445ed8 00000082 f7445f00 00000012 c01bfa2f 00000000 f7a9e280 f7445ea8 c0118397 b1f8808e 3cc9b81f f7a9e940 19ec7e20 0001faed 
c180dbe0 e8af92d0 e8af92f0 c1815be0 00008b3e 19ecb28e 0001faed f747b500 00000082 f74dbf00 Call Trace: [<c01bfa2f>] do_journal_end+0xcf/0xbe0 [<c0118397>] recalc_task_prio+0x97/0x1c0 [<c016ee7c>] pipe_wait+0x7c/0xa0 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c016f07a>] pipe_readv+0x1da/0x2c0 [<c016f42d>] pipe_writev+0x29d/0x360 [<c016f180>] pipe_read+0x20/0x30 [<c0160cc0>] vfs_read+0xb0/0x110 [<c0160f18>] sys_read+0x38/0x60 [<c01073c9>] sysenter_past_esp+0x52/0x71 proftpd D 00000000 0 3328 1 3364 3282 (NOTLB) f7413d34 00000086 00000000 00000000 00000000 00000000 f7a9edc0 00000000 00000000 00000000 00000000 00000000 f7412000 00000000 00000246 f73d0da0 f73d0dc0 c180dbe0 000005ed 980d0112 000222af f747af30 00000000 c030dc20 Call Trace: [<c0129642>] schedule_timeout+0x72/0xd0 [<c01295c0>] process_timeout+0x0/0x10 [<c011bfa8>] io_schedule_timeout+0x28/0x40 [<c020e8ab>] blk_congestion_wait+0x7b/0x90 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c014205b>] __alloc_pages+0x29b/0x330 [<c0144e8d>] do_page_cache_readahead+0x1cd/0x280 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c013e49f>] filemap_nopage+0x17f/0x460 [<c015004b>] do_no_page+0xdb/0x680 [<c013cc31>] unlock_page+0x11/0x60 [<c014f435>] do_wp_page+0x4c5/0x570 [<c015081c>] handle_mm_fault+0xec/0x1d0 [<c0117444>] do_page_fault+0x304/0x4ef [<c010f555>] convert_fxsr_from_user+0x15/0xe0 [<c010f92c>] restore_i387+0x8c/0x90 [<c01066b4>] restore_sigcontext+0x114/0x130 [<c01067b2>] sys_sigreturn+0xe2/0x150 [<c0117140>] do_page_fault+0x0/0x4ef [<c0107e85>] error_code+0x2d/0x38 sshd D 00000000 0 3364 1 3238 3391 3328 (NOTLB) f73f7d18 00000082 00000000 00000000 00000000 00000000 f7a5d040 00000000 00000000 00000000 00000000 00000000 f73f6000 00000000 00000246 00000000 ffffffff c1815be0 00000149 de751563 000222af f740cf50 00000000 c030dc20 Call Trace: [<c0129642>] schedule_timeout+0x72/0xd0 [<c01295c0>] 
process_timeout+0x0/0x10 [<c011bfa8>] io_schedule_timeout+0x28/0x40 [<c020e8ab>] blk_congestion_wait+0x7b/0x90 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c014205b>] __alloc_pages+0x29b/0x330 [<c015bc11>] read_swap_cache_async+0x101/0x10d [<c014f79f>] swapin_readahead+0x2f/0xd0 [<c014fb57>] do_swap_page+0x317/0x430 [<c014d835>] pte_alloc_map+0xc5/0x130 [<c01507f8>] handle_mm_fault+0xc8/0x1d0 [<c0117444>] do_page_fault+0x304/0x4ef [<c0129693>] schedule_timeout+0xc3/0xd0 [<c0175a90>] __pollwait+0x80/0xd0 [<c028bd2d>] tcp_poll+0x1d/0x170 [<c0175a04>] do_select+0x1e7/0x330 [<c0117140>] do_page_fault+0x0/0x4ef [<c0107e85>] error_code+0x2d/0x38 [<c017007b>] get_write_access+0x4b/0xe0 [<c01cf8e0>] __copy_to_user_ll+0x40/0x60 [<c01762e7>] sys_select+0x3b7/0x4d0 [<c010f92c>] restore_i387+0x8c/0x90 [<c01066b4>] restore_sigcontext+0x114/0x130 [<c01073c9>] sysenter_past_esp+0x52/0x71 cron D 00000000 0 3391 1 6677 3401 3364 (NOTLB) f73cbd34 00000082 00000000 00000000 00000000 00000000 f7bd5280 00000000 00000000 00000000 00000000 00000000 f73ca000 00000000 00000246 00000000 ffffffff c180dbe0 00000180 980ff35e 000222af f73d0f70 00000000 c030dc20 Call Trace: [<c0129642>] schedule_timeout+0x72/0xd0 [<c01295c0>] process_timeout+0x0/0x10 [<c011bfa8>] io_schedule_timeout+0x28/0x40 [<c020e8ab>] blk_congestion_wait+0x7b/0x90 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c014205b>] __alloc_pages+0x29b/0x330 [<c011a2c9>] schedule+0x389/0x7a0 [<c0144e8d>] do_page_cache_readahead+0x1cd/0x280 [<c013e49f>] filemap_nopage+0x17f/0x460 [<c015004b>] do_no_page+0xdb/0x680 [<c015081c>] handle_mm_fault+0xec/0x1d0 [<c0117444>] do_page_fault+0x304/0x4ef [<c016c10b>] sys_stat64+0x2b/0x30 [<c0117140>] do_page_fault+0x0/0x4ef [<c0107e85>] error_code+0x2d/0x38 agetty D 00000000 0 3401 1 3402 3391 (NOTLB) f7badc54 00000086 00000000 00000000 00000000 00000000 f7b07dc0 00000000 00000000 00000000 00000000 
00000000 f7bac000 c180e540 c01287ac 00000000 ffffffff c180dbe0 00018704 9823a1c3 000222af f7a88ed0 00000000 c030dc20 Call Trace: [<c01287ac>] __mod_timer+0x23c/0x370 [<c0129642>] schedule_timeout+0x72/0xd0 [<c01295c0>] process_timeout+0x0/0x10 [<c011bfa8>] io_schedule_timeout+0x28/0x40 [<c020e8ab>] blk_congestion_wait+0x7b/0x90 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c014205b>] __alloc_pages+0x29b/0x330 [<c015bc11>] read_swap_cache_async+0x101/0x10d [<c014f79f>] swapin_readahead+0x2f/0xd0 [<c014fb57>] do_swap_page+0x317/0x430 [<c014d835>] pte_alloc_map+0xc5/0x130 [<c01507f8>] handle_mm_fault+0xc8/0x1d0 [<c0117444>] do_page_fault+0x304/0x4ef [<c011a2c9>] schedule+0x389/0x7a0 [<c01bee6f>] journal_end+0xf/0x20 [<c01aeac7>] reiserfs_dirty_inode+0x77/0x110 [<c0117140>] do_page_fault+0x0/0x4ef [<c0107e85>] error_code+0x2d/0x38 [<c025007b>] sg_res_in_use+0x6b/0x80 [<c01cf8e0>] __copy_to_user_ll+0x40/0x60 [<c01fa91d>] read_chan+0x5dd/0xb00 [<c011a730>] default_wake_function+0x0/0x10 [<c011a730>] default_wake_function+0x0/0x10 [<c014e246>] unmap_vmas+0xf6/0x310 [<c011a730>] default_wake_function+0x0/0x10 [<c01f46dd>] tty_write+0x1ad/0x360 [<c01f44f6>] tty_read+0x176/0x1b0 [<c0160cc0>] vfs_read+0xb0/0x110 [<c0160f18>] sys_read+0x38/0x60 [<c01073c9>] sysenter_past_esp+0x52/0x71 agetty S F7D2E800 0 3402 1 3403 3401 (NOTLB) f7a87e58 00000082 00000000 f7d2e800 f7a87e20 f78200d8 f7b07b80 f7820114 c01bee6f 00000000 c01aeac7 000000ff 00000000 c02d88f7 00000000 00000001 0064d901 c180dbe0 000850e1 ef9caa45 00000013 f7a414e0 00000286 f7d2e800 Call Trace: [<c01bee6f>] journal_end+0xf/0x20 [<c01aeac7>] reiserfs_dirty_inode+0x77/0x110 [<c0129693>] schedule_timeout+0xc3/0xd0 [<c0205ba3>] do_con_write+0x2b3/0x740 [<c01facaa>] read_chan+0x96a/0xb00 [<c011a730>] default_wake_function+0x0/0x10 [<c011a730>] default_wake_function+0x0/0x10 [<c014e246>] unmap_vmas+0xf6/0x310 [<c011a730>] default_wake_function+0x0/0x10 [<c01f46dd>] 
tty_write+0x1ad/0x360 [<c01f44f6>] tty_read+0x176/0x1b0 [<c0160cc0>] vfs_read+0xb0/0x110 [<c0160f18>] sys_read+0x38/0x60 [<c01073c9>] sysenter_past_esp+0x52/0x71 agetty S 00003500 0 3403 1 3404 3402 (NOTLB) f73e5e58 00000082 00000000 00003500 175c6fc1 f7de0844 f7b07940 e05da8c0 00000011 00000000 f7de9220 c192f000 c0283e93 c192f000 00000000 f7de9220 f7de0830 c180dbe0 000b78d0 ef8aa30a 00000013 f740c980 00000286 f7d2e800 Call Trace: [<c0283e93>] ip_local_deliver+0xd3/0x1f0 [<c0129693>] schedule_timeout+0xc3/0xd0 [<c0205ba3>] do_con_write+0x2b3/0x740 [<c01facaa>] read_chan+0x96a/0xb00 [<c011a730>] default_wake_function+0x0/0x10 [<c011a730>] default_wake_function+0x0/0x10 [<c014e246>] unmap_vmas+0xf6/0x310 [<c011a730>] default_wake_function+0x0/0x10 [<c01f46dd>] tty_write+0x1ad/0x360 [<c01f44f6>] tty_read+0x176/0x1b0 [<c0160cc0>] vfs_read+0xb0/0x110 [<c0160f18>] sys_read+0x38/0x60 [<c01073c9>] sysenter_past_esp+0x52/0x71 agetty S 00000000 0 3404 1 3405 3403 (NOTLB) f74c3e58 00000082 000001ff 00000000 00000003 00000000 f7a4d280 00000000 00020000 00000000 f74c3e6c 000000ff 00000000 00000000 00000000 00000003 00000286 c1815be0 0007e435 ef918c51 00000013 f7bb3440 00000286 00000000 Call Trace: [<c0129693>] schedule_timeout+0xc3/0xd0 [<c0205ba3>] do_con_write+0x2b3/0x740 [<c01facaa>] read_chan+0x96a/0xb00 [<c011a730>] default_wake_function+0x0/0x10 [<c011a730>] default_wake_function+0x0/0x10 [<c014e246>] unmap_vmas+0xf6/0x310 [<c011a730>] default_wake_function+0x0/0x10 [<c01f46dd>] tty_write+0x1ad/0x360 [<c01f44f6>] tty_read+0x176/0x1b0 [<c0160cc0>] vfs_read+0xb0/0x110 [<c0160f18>] sys_read+0x38/0x60 [<c01073c9>] sysenter_past_esp+0x52/0x71 agetty S 00000000 0 3405 1 3406 3404 (NOTLB) f73cfe58 00000086 000001ff 00000000 00000004 00000000 f7a6a4c0 00000000 00020000 00000000 f73cfe6c 000000ff 00000000 00000000 00000000 00000004 00000286 c180dbe0 00084d71 efb34714 00000013 f73d1b10 00000286 00000000 Call Trace: [<c0129693>] schedule_timeout+0xc3/0xd0 [<c0205ba3>] 
do_con_write+0x2b3/0x740 [<c01facaa>] read_chan+0x96a/0xb00 [<c011a730>] default_wake_function+0x0/0x10 [<c011a730>] default_wake_function+0x0/0x10 [<c014e246>] unmap_vmas+0xf6/0x310 [<c011a730>] default_wake_function+0x0/0x10 [<c01f46dd>] tty_write+0x1ad/0x360 [<c01f44f6>] tty_read+0x176/0x1b0 [<c0160cc0>] vfs_read+0xb0/0x110 [<c0160f18>] sys_read+0x38/0x60 [<c01073c9>] sysenter_past_esp+0x52/0x71 agetty S F7D2E800 0 3406 1 4611 3405 (NOTLB) f73e3e58 00000082 00000000 f7d2e800 f73e3e20 f78200d8 f7cd8040 f7820114 c01bee6f 00000000 c01aeac7 000000ff 00000000 c02d88f7 00000000 00000001 0064d901 c1815be0 0007cc5c efa78898 00000013 f740c3b0 00000286 f7d2e800 Call Trace: [<c01bee6f>] journal_end+0xf/0x20 [<c01aeac7>] reiserfs_dirty_inode+0x77/0x110 [<c0129693>] schedule_timeout+0xc3/0xd0 [<c0205ba3>] do_con_write+0x2b3/0x740 [<c01facaa>] read_chan+0x96a/0xb00 [<c011a730>] default_wake_function+0x0/0x10 [<c011a730>] default_wake_function+0x0/0x10 [<c014e246>] unmap_vmas+0xf6/0x310 [<c011a730>] default_wake_function+0x0/0x10 [<c01f46dd>] tty_write+0x1ad/0x360 [<c01f44f6>] tty_read+0x176/0x1b0 [<c0160cc0>] vfs_read+0xb0/0x110 [<c0160f18>] sys_read+0x38/0x60 [<c01073c9>] sysenter_past_esp+0x52/0x71 ntpd D 00000000 0 4611 1 3406 (NOTLB) ce397bd0 00000082 00000000 00000000 00000000 00000000 f7bd5b80 00000000 00000000 00000000 00000000 00000000 ce396000 00000000 00000246 f7bb2100 f7bb2120 c1815be0 0000018d f19a9e2b 000222af cc3de3b0 00000000 c030dc20 Call Trace: [<c0129642>] schedule_timeout+0x72/0xd0 [<c01295c0>] process_timeout+0x0/0x10 [<c011bfa8>] io_schedule_timeout+0x28/0x40 [<c020e8ab>] blk_congestion_wait+0x7b/0x90 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c014205b>] __alloc_pages+0x29b/0x330 [<c013cf9d>] find_lock_page+0x4d/0x270 [<c013f42c>] generic_file_aio_write_nolock+0x33c/0xba0 [<c0148054>] mark_page_accessed+0x34/0x40 [<c0164052>] __find_get_block+0x62/0xc0 [<c0164052>] __find_get_block+0x62/0xc0 
[<c01b6c92>] search_by_key+0x642/0xe10 [<c013fced>] generic_file_write_nolock+0x5d/0x80 [<c019fa27>] reiserfs_find_entry+0x97/0x150 [<c013fddf>] generic_file_write+0x3f/0x60 [<c01aa31f>] reiserfs_file_write+0x7ff/0x810 [<c019fbfc>] reiserfs_lookup+0x11c/0x1f0 [<c0130dd9>] in_group_p+0x39/0x70 [<c016ff29>] vfs_permission+0x79/0x140 [<c017a24c>] dput+0x1c/0x3a0 [<c01701fa>] path_release+0xa/0x30 [<c0171be7>] open_namei+0xb7/0x3e0 [<c015fc6d>] filp_open+0x2d/0x60 [<c0160e80>] vfs_write+0xb0/0x110 [<c0160f78>] sys_write+0x38/0x60 [<c01073c9>] sysenter_past_esp+0x52/0x71 httpd D 00000000 0 12739 3100 12740 (NOTLB) ee10dc70 00000086 00000000 00000000 00000000 00000000 f7a9c4c0 00000000 00000000 00000000 00000000 00000000 ee10c000 9821ec28 000222af eff200a0 eff200c0 c180dbe0 00000644 9821eff8 000222af f714e3f0 00000000 c030dc20 Call Trace: [<c0129642>] schedule_timeout+0x72/0xd0 [<c01295c0>] process_timeout+0x0/0x10 [<c011bfa8>] io_schedule_timeout+0x28/0x40 [<c020e8ab>] blk_congestion_wait+0x7b/0x90 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c014205b>] __alloc_pages+0x29b/0x330 [<c015bc11>] read_swap_cache_async+0x101/0x10d [<c014f79f>] swapin_readahead+0x2f/0xd0 [<c014fb57>] do_swap_page+0x317/0x430 [<c014d835>] pte_alloc_map+0xc5/0x130 [<c01507f8>] handle_mm_fault+0xc8/0x1d0 [<c0117444>] do_page_fault+0x304/0x4ef [<c0129693>] schedule_timeout+0xc3/0xd0 [<c02688fa>] lock_sock+0x6a/0xc0 [<c0268e09>] __kfree_skb+0x79/0x100 [<c02905fc>] wait_for_connect+0xec/0x110 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c0117140>] do_page_fault+0x0/0x4ef [<c0107e85>] error_code+0x2d/0x38 [<c01cf529>] __get_user_4+0x11/0x17 [<c0264955>] move_addr_to_user+0x25/0x90 [<c017dab0>] new_inode+0x10/0xc0 [<c0265f7c>] sys_accept+0xec/0x160 [<c028fd7b>] tcp_close+0x36b/0x720 [<c0266b05>] sys_socketcall+0xf5/0x2a0 [<c01073c9>] sysenter_past_esp+0x52/0x71 httpd D 00000000 0 12740 3100 12741 12739 (NOTLB) dc433c70 00000086 00000000 00000000 
00000000 00000000 e45efdc0 00000000 00000000 00000000 00000000 00000000 dc432000 00000000 00000246 f714e220 f714e240 c180dbe0 0000022d 981ea476 000222af eff219b0 00000000 c030dc20 Call Trace: [<c0129642>] schedule_timeout+0x72/0xd0 [<c01295c0>] process_timeout+0x0/0x10 [<c011bfa8>] io_schedule_timeout+0x28/0x40 [<c020e8ab>] blk_congestion_wait+0x7b/0x90 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c014205b>] __alloc_pages+0x29b/0x330 [<c015bc11>] read_swap_cache_async+0x101/0x10d [<c014f79f>] swapin_readahead+0x2f/0xd0 [<c014fb57>] do_swap_page+0x317/0x430 [<c014d835>] pte_alloc_map+0xc5/0x130 [<c01507f8>] handle_mm_fault+0xc8/0x1d0 [<c0117444>] do_page_fault+0x304/0x4ef [<c0129693>] schedule_timeout+0xc3/0xd0 [<c02688fa>] lock_sock+0x6a/0xc0 [<c0268e09>] __kfree_skb+0x79/0x100 [<c02905fc>] wait_for_connect+0xec/0x110 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c0117140>] do_page_fault+0x0/0x4ef [<c0107e85>] error_code+0x2d/0x38 [<c01cf529>] __get_user_4+0x11/0x17 [<c0264955>] move_addr_to_user+0x25/0x90 [<c017dab0>] new_inode+0x10/0xc0 [<c0265f7c>] sys_accept+0xec/0x160 [<c028fd7b>] tcp_close+0x36b/0x720 [<c0266b05>] sys_socketcall+0xf5/0x2a0 [<c01073c9>] sysenter_past_esp+0x52/0x71 httpd D 00000000 0 12741 3100 12742 12740 (NOTLB) f580dc70 00000086 00000000 00000000 00000000 00000000 f7a6a700 00000000 00000000 00000000 00000000 00000000 f580c000 00000000 00000246 00000000 ffffffff c1815be0 00000182 fdf459c5 000222af f7a63a90 00000000 c030dc20 Call Trace: [<c0129642>] schedule_timeout+0x72/0xd0 [<c01295c0>] process_timeout+0x0/0x10 [<c011bfa8>] io_schedule_timeout+0x28/0x40 [<c020e8ab>] blk_congestion_wait+0x7b/0x90 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c014205b>] __alloc_pages+0x29b/0x330 [<c015bc11>] read_swap_cache_async+0x101/0x10d [<c014f79f>] swapin_readahead+0x2f/0xd0 [<c014fb57>] do_swap_page+0x317/0x430 [<c014d835>] pte_alloc_map+0xc5/0x130 
[<c01507f8>] handle_mm_fault+0xc8/0x1d0 [<c0117444>] do_page_fault+0x304/0x4ef [<c0129693>] schedule_timeout+0xc3/0xd0 [<c02688fa>] lock_sock+0x6a/0xc0 [<c0268e09>] __kfree_skb+0x79/0x100 [<c02905fc>] wait_for_connect+0xec/0x110 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c0117140>] do_page_fault+0x0/0x4ef [<c0107e85>] error_code+0x2d/0x38 [<c01cf529>] __get_user_4+0x11/0x17 [<c0264955>] move_addr_to_user+0x25/0x90 [<c017dab0>] new_inode+0x10/0xc0 [<c0265f7c>] sys_accept+0xec/0x160 [<c028fd7b>] tcp_close+0x36b/0x720 [<c0266b05>] sys_socketcall+0xf5/0x2a0 [<c01073c9>] sysenter_past_esp+0x52/0x71 httpd R running 0 12742 3100 12743 12741 (NOTLB) httpd D 00000000 0 12743 3100 13713 12742 (NOTLB) f7501c70 00000086 00000000 00000000 00000000 00000000 f7a9e700 00000000 00000000 00000000 00000000 00000000 f7500000 00000000 00000246 f73d07d0 f73d07f0 c1815be0 0000015c 0315f596 000222b0 f7c9d9f0 00000000 c030dc20 Call Trace: [<c0129642>] schedule_timeout+0x72/0xd0 [<c01295c0>] process_timeout+0x0/0x10 [<c011bfa8>] io_schedule_timeout+0x28/0x40 [<c020e8ab>] blk_congestion_wait+0x7b/0x90 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c014205b>] __alloc_pages+0x29b/0x330 [<c015bc11>] read_swap_cache_async+0x101/0x10d [<c014f79f>] swapin_readahead+0x2f/0xd0 [<c014fb57>] do_swap_page+0x317/0x430 [<c014d835>] pte_alloc_map+0xc5/0x130 [<c01507f8>] handle_mm_fault+0xc8/0x1d0 [<c0117444>] do_page_fault+0x304/0x4ef [<c0129693>] schedule_timeout+0xc3/0xd0 [<c02688fa>] lock_sock+0x6a/0xc0 [<c0268e09>] __kfree_skb+0x79/0x100 [<c02905fc>] wait_for_connect+0xec/0x110 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c0117140>] do_page_fault+0x0/0x4ef [<c0107e85>] error_code+0x2d/0x38 [<c01cf529>] __get_user_4+0x11/0x17 [<c0264955>] move_addr_to_user+0x25/0x90 [<c017dab0>] new_inode+0x10/0xc0 [<c0265f7c>] sys_accept+0xec/0x160 [<c028fd7b>] tcp_close+0x36b/0x720 [<c0266b05>] sys_socketcall+0xf5/0x2a0 [<c01073c9>] 
sysenter_past_esp+0x52/0x71 httpd D 00000000 0 13713 3100 19047 12743 (NOTLB) df9a5c70 00000082 00000000 00000000 00000000 00000000 e45ef280 00000000 00000000 00000000 00000000 00000000 df9a4000 00000000 00000246 00000000 ffffffff c180dbe0 000001f6 976392ca 000222af f714e9c0 00000000 c030dc20 Call Trace: [<c0129642>] schedule_timeout+0x72/0xd0 [<c01295c0>] process_timeout+0x0/0x10 [<c011bfa8>] io_schedule_timeout+0x28/0x40 [<c020e8ab>] blk_congestion_wait+0x7b/0x90 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c014205b>] __alloc_pages+0x29b/0x330 [<c015bc11>] read_swap_cache_async+0x101/0x10d [<c014f79f>] swapin_readahead+0x2f/0xd0 [<c014fb57>] do_swap_page+0x317/0x430 [<c014d835>] pte_alloc_map+0xc5/0x130 [<c01507f8>] handle_mm_fault+0xc8/0x1d0 [<c0117444>] do_page_fault+0x304/0x4ef [<c0129693>] schedule_timeout+0xc3/0xd0 [<c02688fa>] lock_sock+0x6a/0xc0 [<c0268e09>] __kfree_skb+0x79/0x100 [<c02905fc>] wait_for_connect+0xec/0x110 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c0117140>] do_page_fault+0x0/0x4ef [<c0107e85>] error_code+0x2d/0x38 [<c01cf529>] __get_user_4+0x11/0x17 [<c0264955>] move_addr_to_user+0x25/0x90 [<c017dab0>] new_inode+0x10/0xc0 [<c0265f7c>] sys_accept+0xec/0x160 [<c014e121>] unmap_page_range+0x31/0x60 [<c014e246>] unmap_vmas+0xf6/0x310 [<c01488ff>] __pagevec_lru_add_active+0x13f/0x1b0 [<c017a24c>] dput+0x1c/0x3a0 [<c0161d39>] __fput+0xb9/0x120 [<c0266b05>] sys_socketcall+0xf5/0x2a0 [<c0152ce4>] do_munmap+0x154/0x1b0 [<c01073c9>] sysenter_past_esp+0x52/0x71 httpd D 00000000 0 19047 3100 13713 (NOTLB) c5e5dc70 00000082 00000000 00000000 00000000 00000000 f7a9e040 00000000 00000000 00000000 00000000 00000000 c5e5c000 00000000 00000246 f740cd80 f740cda0 c1815be0 00000160 0cfd8041 000222b0 eff20840 00000000 c030dc20 Call Trace: [<c0129642>] schedule_timeout+0x72/0xd0 [<c01295c0>] process_timeout+0x0/0x10 [<c011bfa8>] io_schedule_timeout+0x28/0x40 [<c020e8ab>] blk_congestion_wait+0x7b/0x90 
[<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c014205b>] __alloc_pages+0x29b/0x330 [<c015bc11>] read_swap_cache_async+0x101/0x10d [<c014f79f>] swapin_readahead+0x2f/0xd0 [<c014fb57>] do_swap_page+0x317/0x430 [<c014d835>] pte_alloc_map+0xc5/0x130 [<c01507f8>] handle_mm_fault+0xc8/0x1d0 [<c0117444>] do_page_fault+0x304/0x4ef [<c0129693>] schedule_timeout+0xc3/0xd0 [<c02688fa>] lock_sock+0x6a/0xc0 [<c0268e09>] __kfree_skb+0x79/0x100 [<c02905fc>] wait_for_connect+0xec/0x110 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c0117140>] do_page_fault+0x0/0x4ef [<c0107e85>] error_code+0x2d/0x38 [<c01cf529>] __get_user_4+0x11/0x17 [<c0264955>] move_addr_to_user+0x25/0x90 [<c017dab0>] new_inode+0x10/0xc0 [<c0265f7c>] sys_accept+0xec/0x160 [<c028fd7b>] tcp_close+0x36b/0x720 [<c0266b05>] sys_socketcall+0xf5/0x2a0 [<c01073c9>] sysenter_past_esp+0x52/0x71 sshd S C3331DA4 0 3238 3364 3247 3756 (NOTLB) c3331d7c 00000086 c3331e50 c3331da4 00000000 c3331e50 c1fa6dc0 00000000 000475c4 00000000 fac86840 00000000 00000000 c1064f80 00000001 c3331e50 c01b637e c1815be0 0004594a 64ab66fa 0001d39c f7aafa50 c3331da8 e06aaa38 Call Trace: [<c01b637e>] pathrelse+0x1e/0x30 [<c0129693>] schedule_timeout+0xc3/0xd0 [<c013fced>] generic_file_write_nolock+0x5d/0x80 [<c02c090a>] unix_stream_data_wait+0xfa/0x180 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c02c1003>] unix_stream_recvmsg+0x673/0x710 [<c01aa31f>] reiserfs_file_write+0x7ff/0x810 [<c0265030>] sock_aio_read+0xb0/0xd0 [<c0160bcd>] do_sync_read+0x6d/0xb0 [<c01f556e>] release_dev+0x33e/0x7e0 [<c015fdec>] dentry_open+0x14c/0x220 [<c015fc8f>] filp_open+0x4f/0x60 [<c0160cf7>] vfs_read+0xe7/0x110 [<c0161d39>] __fput+0xb9/0x120 [<c0160f18>] sys_read+0x38/0x60 [<c01073c9>] sysenter_past_esp+0x52/0x71 sshd D 00000000 0 3247 3238 3248 (NOTLB) c309dd18 00000086 00000000 00000000 00000000 00000000 c1fa64c0 00000000 00000000 00000000 00000000 00000000 
c309c000 00000000 00000246 00000000 ffffffff c1815be0 c030dc20 Call Trace: [<c0129642>] schedule_timeout+0x72/0xd0 [<c01295c0>] process_timeout+0x0/0x10 [<c011bfa8>] io_schedule_timeout+0x28/0x40 [<c020e8ab>] blk_congestion_wait+0x7b/0x90 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c011d4a0>] autoremove_wake_function+0x0/0x40 [<c014205b>] __alloc_pages+0x29b/0x330 [<c015bc11>] read_swap_cache_async+0x101/0x10d [<c014f79f>] swapin_readahead+0x2f/0xd0 [<c014fb57>] do_swap_page+0x317/0x430 [<c014d835>] pte_alloc_map+0xc5/0x130 [<c01507f8>] handle_mm_fault+0xc8/0x1d0 [<c0117444>] do_page_fault+0x304/0x4ef [<c01fc19a>] pty_chars_in_buffer+0x1a/0x40 [<c01fc175>] pty_write_room+0x25/0x30 [<c0175a04>] poll_freewait+0x44/0x50 [<c0175dc7>] do_select+0x1e7/0x330 [<c0117140>] do_page_fault+0x0/0x4ef [<c0107e85>] error_code+0x2d/0x38 [<c017007b>] get_write_access+0x4b/0xe0 [<c01cf8e0>] __copy_to_user_ll+0x40/0x60 [<c01762e7>] sys_select+0x3b7/0x4d0 [<c0160f78>] sys_write+0x38/0x60 [<c01073c9>] sysenter_past_esp+0x52/0x71 bash S C030D340 0 3248 3247 (NOTLB) c757be58 00000086 00000010 c030d340 00000246 01470f60 f7a5d4c0 00000000 00000000 00000010 c1817708 00000000 f406c260 c180dbe0 0001e5df 4853c415 0001e873 c19633c0 c0129026 00000001 Call Trace: [<c0129026>] update_wall_time+0x16/0x40 [<c0129693>] schedule_timeout+0xc3/0xd0 [<c01f81b4>] opost_block+0xf4/0x1b0 [<c01facaa>] read_chan+0x96a/0xb00 [<c014f435>] do_wp_page+0x4c5/0x570 [<c011a730>] default_wake_function+0x0/0x10 [<c011a730>] default_wake_function+0x0/0x10 [<c011a730>] default_wake_function+0x0/0x10 [<c01f46dd>] tty_write+0x1ad/0x360 [<c01f44f6>] tty_read+0x176/0x1b0 [<c0160cc0>] vfs_read+0xb0/0x110 [<c0160f18>] sys_read+0x38/0x60 [<c01073c9>] sysenter_past_esp+0x52/0x71 sshd S EDEF5DA4 0 3756 3364 3759 3914 3238 (NOTLB) edef5d7c 00000082 edef5e50 edef5da4 c010dfe0 c03c6cb0 f7a9cb80 e089c860 00000620 c192f240 c0268b62 d7c83812 d7c83812 d7c83812 00000000 00000246 f7fa3190 c1815be0 000024d4 56c3dc45 0001da75 f70aa9e0 
f70ab980 f70aa810
Call Trace:
[<c010dfe0>] do_gettimeofday+0x20/0xc0
[<c0268b62>] alloc_skb+0x32/0xd0
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c02c090a>] unix_stream_data_wait+0xfa/0x180
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c0107365>] need_resched+0x27/0x32
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c02c1003>] unix_stream_recvmsg+0x673/0x710
[<c013dc64>] file_read_actor+0xc4/0xd0
[<c01aa31f>] reiserfs_file_write+0x7ff/0x810
[<c0265030>] sock_aio_read+0xb0/0xd0
[<c0160bcd>] do_sync_read+0x6d/0xb0
[<c01f556e>] release_dev+0x33e/0x7e0
[<c015fdec>] dentry_open+0x14c/0x220
[<c015fc8f>] filp_open+0x4f/0x60
[<c0160cf7>] vfs_read+0xe7/0x110
[<c0161d39>] __fput+0xb9/0x120
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
sshd D 00000000 0 3759 3756 3760 (NOTLB)
d49bfd18 00000086 00000000 00000000 00000000 00000000 c1fa6700 00000000
00000000 00000000 00000000 00000000 d49be000 00000000 00000246 00000000
ffffffff c1815be0 00000bad 1ca15353 000222b0 f7c9ce50 00000000 c030dc20
Call Trace:
[<c0129642>] schedule_timeout+0x72/0xd0
[<c01295c0>] process_timeout+0x0/0x10
[<c011bfa8>] io_schedule_timeout+0x28/0x40
[<c020e8ab>] blk_congestion_wait+0x7b/0x90
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c014205b>] __alloc_pages+0x29b/0x330
[<c015bc11>] read_swap_cache_async+0x101/0x10d
[<c014f79f>] swapin_readahead+0x2f/0xd0
[<c014fb57>] do_swap_page+0x317/0x430
[<c014d835>] pte_alloc_map+0xc5/0x130
[<c01507f8>] handle_mm_fault+0xc8/0x1d0
[<c0117444>] do_page_fault+0x304/0x4ef
[<c01fc19a>] pty_chars_in_buffer+0x1a/0x40
[<c01fc175>] pty_write_room+0x25/0x30
[<c0175a04>] poll_freewait+0x44/0x50
[<c0175dc7>] do_select+0x1e7/0x330
[<c0117140>] do_page_fault+0x0/0x4ef
[<c0107e85>] error_code+0x2d/0x38
[<c017007b>] get_write_access+0x4b/0xe0
[<c01cf8e0>] __copy_to_user_ll+0x40/0x60
[<c01762e7>] sys_select+0x3b7/0x4d0
[<c0160f78>] sys_write+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
bash S 00000246 0 3760 3759 (NOTLB)
e7045e58 00000086 c030d300 00000246 01468f60 c0141c4f f7cd84c0 db8215b4
c030d680 c12446c0 00000000 00000000 00000082 c1914c00 d20a9000 e7045e94
e7045e6c c180dbe0 0008cdf5 57ea2b52 0001db26 f714fb30 c013ce05 c011a771
Call Trace:
[<c0141c4f>] buffered_rmqueue+0x10f/0x280
[<c013ce05>] find_get_page+0x35/0xc0
[<c011a771>] __wake_up_common+0x31/0x60
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c01f81b4>] opost_block+0xf4/0x1b0
[<c01facaa>] read_chan+0x96a/0xb00
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c01f46dd>] tty_write+0x1ad/0x360
[<c01f44f6>] tty_read+0x176/0x1b0
[<c0160cc0>] vfs_read+0xb0/0x110
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
sshd S D806FDA4 0 3914 3364 3917 3756 (NOTLB)
d806fd7c 00000086 d806fe50 d806fda4 00000000 d806fe50 f7a6adc0 00000000
00047c9c 00000000 1ec86840 d806fd5c f7aae140 d5d0fb1e 02036e86 d19acde0
d19ace00 c180dbe0 0000201b 11d21020 000203d5 eff20e10 c180dbe0 d806fd98
Call Trace:
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c0118397>] recalc_task_prio+0x97/0x1c0
[<c02c090a>] unix_stream_data_wait+0xfa/0x180
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4b1>] autoremove_wake_function+0x11/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c0268b62>] alloc_skb+0x32/0xd0
[<c02c1003>] unix_stream_recvmsg+0x673/0x710
[<c0265030>] sock_aio_read+0xb0/0xd0
[<c0160bcd>] do_sync_read+0x6d/0xb0
[<c0129026>] update_wall_time+0x16/0x40
[<c0160cf7>] vfs_read+0xe7/0x110
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
sshd D 00000000 0 3917 3914 3918 (NOTLB)
f682dd18 00000000 00000000 00000000 00000000 f7a9c040 00000000 00000000
00000000 00000000 f682c000 00000000 00000246 00000000 ffffffff c180dbe0
000014af 982fad90 000222af f7aae310 00000000 c030dc20
Call Trace:
[<c0129642>] schedule_timeout+0x72/0xd0
[<c01295c0>] process_timeout+0x0/0x10
[<c011bfa8>] io_schedule_timeout+0x28/0x40
[<c020e8ab>] blk_congestion_wait+0x7b/0x90
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c014205b>] __alloc_pages+0x29b/0x330
[<c01cf8f9>] __copy_to_user_ll+0x59/0x60
[<c015bc11>] read_swap_cache_async+0x101/0x10d
[<c014f79f>] swapin_readahead+0x2f/0xd0
[<c014fb57>] do_swap_page+0x317/0x430
[<c014d835>] pte_alloc_map+0xc5/0x130
[<c01507f8>] handle_mm_fault+0xc8/0x1d0
[<c0117444>] do_page_fault+0x304/0x4ef
[<c01fc19a>] pty_chars_in_buffer+0x1a/0x40
[<c01fc175>] pty_write_room+0x25/0x30
[<c0175a04>] poll_freewait+0x44/0x50
[<c0175dc7>] do_select+0x1e7/0x330
[<c0117140>] do_page_fault+0x0/0x4ef
[<c0107e85>] error_code+0x2d/0x38
[<c017007b>] get_write_access+0x4b/0xe0
[<c01cf8e0>] __copy_to_user_ll+0x40/0x60
[<c01762e7>] sys_select+0x3b7/0x4d0
[<c0114246>] smp_apic_timer_interrupt+0xd6/0x140
[<c01073c9>] sysenter_past_esp+0x52/0x71
bash S 00000246 0 3918 3917 (NOTLB)
ef5dde58 00000082 c030d780 00000246 c030d780 c0141c4f f7bd5040 db8215b4
c030db00 c17735c0 00000000 00000000 0000038e c1914c00 eff81000 d19ac810
d19ac830 c180dbe0 000cb843 74fcf82c 0001dc80 f7a40940 c013ce05 00000000
Call Trace:
[<c0141c4f>] buffered_rmqueue+0x10f/0x280
[<c013ce05>] find_get_page+0x35/0xc0
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c01f81b4>] opost_block+0xf4/0x1b0
[<c01facaa>] read_chan+0x96a/0xb00
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c01f46dd>] tty_write+0x1ad/0x360
[<c01f44f6>] tty_read+0x176/0x1b0
[<c0160cc0>] vfs_read+0xb0/0x110
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
pdflush S 00000000 0 3951 6 24 (L-TLB)
c268df78 00000046 00000000 00000000 00000000 00000000 f7a4d940 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 f73d07d0
f73d07f0 c1815be0 00000110 29477494 000222b0 f7bb28a0 00000000 00000000
Call Trace:
[<c0144105>] __pdflush+0xd5/0x380
[<c011a771>] __wake_up_common+0x31/0x60
[<c01443b0>] pdflush+0x0/0x10
[<c01443ba>] pdflush+0xa/0x10
[<c01443b0>] pdflush+0x0/0x10
[<c0135e94>] kthread+0xa4/0xb0
[<c0135df0>] kthread+0x0/0xb0
[<c0104ec5>] kernel_thread_helper+0x5/0x10
pdflush S 00000000 0 6583 7 (L-TLB)
c4ae1f78 00000046 00000000 00000000 00000000 00000000 f7a9c040 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 c180dbe0 00000285 9884aecd 000222af eff20270 00000000 00000000
Call Trace:
[<c0144105>] __pdflush+0xd5/0x380
[<c011a771>] __wake_up_common+0x31/0x60
[<c01443b0>] pdflush+0x0/0x10
[<c01443ba>] pdflush+0xa/0x10
[<c01443b0>] pdflush+0x0/0x10
[<c0135e94>] kthread+0xa4/0xb0
[<c0135df0>] kthread+0x0/0xb0
[<c0104ec5>] kernel_thread_helper+0x5/0x10
cron S C030D300 0 6677 3391 6678 6760 (NOTLB)
f4233ed8 00000082 c0141e77 c030d300 00000010 00000001 e45efb80 d19ac240
f7b074c0 f7a85f0c c011a2c9 f4233f04 00000082 d19ac240 00000010 e7892280
e78922a0 c1815be0 0000eff8 924d6713 00020568 d19ac410 f7fffaa0 f7a62180
Call Trace:
[<c0141e77>] __alloc_pages+0xb7/0x330
[<c011a2c9>] schedule+0x389/0x7a0
[<c016ee7c>] pipe_wait+0x7c/0xa0
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c01cf8f9>] __copy_to_user_ll+0x59/0x60
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c016f07a>] pipe_readv+0x1da/0x2c0
[<c016f180>] pipe_read+0x20/0x30
[<c0160cc0>] vfs_read+0xb0/0x110
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
sh S E0B63080 0 6678 6677 6679 (NOTLB)
caa4df48 c01508a3 e78483bc e45ef940 2c5fa065 c011db30 f6f71380 e45ef940
e45ef960 f6f71380 d19ad3b0 c0117444 f73d0da0 f73d0dc0 c1815be0 0002185a
91df5d00 00020568 d19ad580 f1499544 00000001
Call Trace:
[<c01508a3>] handle_mm_fault+0x173/0x1d0
[<c011db30>] copy_mm+0x250/0x570
[<c0117444>] do_page_fault+0x304/0x4ef
[<c012354b>] sys_wait4+0x1bb/0x280
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c0123635>] sys_waitpid+0x25/0x29
[<c01073c9>] sysenter_past_esp+0x52/0x71
daily_reports S F0D7B080 0 6679 6678 6689 (NOTLB)
e9afbf48 00000086 080f1e34 f0d7b080 c01508a3 edc523c4 f7cd8700 347c9065
c011db30 f6f71770 f7cd8700 f7cd8720 f6f71770 d19ac810 c0117444 00000001
aea87c72 c180dbe0 00016174 5e27e5ad 0002057f d19ac9e0 f712a584 00000000
Call Trace:
[<c01508a3>] handle_mm_fault+0x173/0x1d0
[<c011db30>] copy_mm+0x250/0x570
[<c0117444>] do_page_fault+0x304/0x4ef
[<c012354b>] sys_wait4+0x1bb/0x280
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c0123635>] sys_waitpid+0x25/0x29
[<c01073c9>] sysenter_past_esp+0x52/0x71
mysql S C015081C 0 6689 6679 (NOTLB)
f5fe5d7c 00000086 d57ed400 c015081c 00000001 f4182998 f7d66dc0 c01bebe9
ccadf818 f7d66dc0 f7d66de0 ccadf818 f73d07d0 603d05d3 0002057f f73d07d0
f73d07f0 c1815be0 000021e1 605a5570 0002057f e78935c0 00000000 f5fe5df8
Call Trace:
[<c015081c>] handle_mm_fault+0xec/0x1d0
[<c01bebe9>] journal_mark_dirty+0x159/0x2e0
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c0141e77>] __alloc_pages+0xb7/0x330
[<c011d4b1>] autoremove_wake_function+0x11/0x40
[<c02c090a>] unix_stream_data_wait+0xfa/0x180
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c0268b62>] alloc_skb+0x32/0xd0
[<c02c1003>] unix_stream_recvmsg+0x673/0x710
[<c0265030>] sock_aio_read+0xb0/0xd0
[<c0160bcd>] do_sync_read+0x6d/0xb0
[<c014e246>] unmap_vmas+0xf6/0x310
[<c01488ff>] __pagevec_lru_add_active+0x13f/0x1b0
[<c012eb45>] sys_rt_sigaction+0xd5/0xf0
[<c0160cf7>] vfs_read+0xe7/0x110
[<c017489b>] do_fcntl+0x11b/0x1d0
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
cron S C030D300 0 6760 3391 6761 6677 (NOTLB)
eb401ed8 00000082 c0141e77 c030d300 00000010 00000001 f7a4d040 e7892280
f7a4d040 d4323c28 c011a2c9 eb401f04 00000082 e7892280 00000010 c1962c20
c1962c40 c1815be0 0000d409 4121cbab 0002062c e7892450 f7fffaa0 c1962c20
Call Trace:
[<c0141e77>] __alloc_pages+0xb7/0x330
[<c011a2c9>] schedule+0x389/0x7a0
[<c016ee7c>] pipe_wait+0x7c/0xa0
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c01cf8f9>] __copy_to_user_ll+0x59/0x60
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c016f07a>] pipe_readv+0x1da/0x2c0
[<c016f180>] pipe_read+0x20/0x30
[<c0160cc0>] vfs_read+0xb0/0x110
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
sh S F7B97860 0 6761 6760 6763 (NOTLB)
d4323f48 00000086 0002062c f7b97860 f7b97880 c1815be0 f7a4d4c0 4190165b
0002062c c1962df4 d4323f7c d4322000 d4322000 d4323f7c d4322000 e8af8160
e8af8180 c1815be0 000015ca 41908132 0002062c c1962df0 f7bf5fa4 00000001
Call Trace:
[<c012354b>] sys_wait4+0x1bb/0x280
[<c011a730>] default_wake_function+0x0/0x10
[<c011a730>] default_wake_function+0x0/0x10
[<c0123635>] sys_waitpid+0x25/0x29
[<c01073c9>] sysenter_past_esp+0x52/0x71
php S D2527DA4 0 6763 6761 (NOTLB)
d2527d7c 00000082 d2527e50 d2527da4 00000000 d2527e50 c1fa6280 00000000
00008c65 00000000 36d5b0c8 00000000 f70ab3b0 c10c0900 e5d5b060 f70ab3b0
f70ab3d0 c180dbe0 00002530 8a40e181 0002062c e8af8ed0 00000000 f7dacc00
Call Trace:
[<c0129693>] schedule_timeout+0xc3/0xd0
[<c0118397>] recalc_task_prio+0x97/0x1c0
[<c02c090a>] unix_stream_data_wait+0xfa/0x180
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c011d4b1>] autoremove_wake_function+0x11/0x40
[<c011d4a0>] autoremove_wake_function+0x0/0x40
[<c02c1003>] unix_stream_recvmsg+0x673/0x710
[<c0265030>] sock_aio_read+0xb0/0xd0
[<c0160bcd>] do_sync_read+0x6d/0xb0
[<c0160cf7>] vfs_read+0xe7/0x110
[<c017489b>] do_fcntl+0x11b/0x1d0
[<c0160f18>] sys_read+0x38/0x60
[<c01073c9>] sysenter_past_esp+0x52/0x71
--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc
^ permalink raw reply [flat|nested] 24+ messages in thread
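A sysrq-T dump of this size is hard to scan by eye, so here is a minimal sketch of how it could be summarised by task state. It is not part of the original thread. The header layout it assumes (name, state letter, one hex word, a number, pid, parent pid, optional sibling pids, then "(NOTLB)" or "(L-TLB)") is inferred only from the dump above, and the names Task, parse_tasks, blocked_on and SAMPLE are hypothetical. In the portion of the dump shown above, the two sshd tasks in state D (pids 3759 and 3917) both sit in blk_congestion_wait underneath __alloc_pages; the sketch picks out tasks of that kind.

```python
import re
from collections import namedtuple

# Hypothetical record type for one task in a sysrq-T dump.
Task = namedtuple("Task", "name state pid ppid trace")

# Task header as it appears in the dump above, e.g.
#   sshd D 00000000 0 3759 3756 3760 (NOTLB)
# The field meanings are an assumption based on that layout.
HEADER = re.compile(
    r"(?<!\S)(?P<name>\S+)\s+(?P<state>[RSDTZXW])\s+[0-9A-Fa-f]{8}\s+\d+\s+"
    r"(?P<pid>\d+)\s+(?P<ppid>\d+)(?:\s+\d+)*\s+\((?:NOTLB|L-TLB)\)"
)
# One call-trace entry, e.g. "[<c020e8ab>] blk_congestion_wait+0x7b/0x90".
FRAME = re.compile(r"\[<[0-9a-f]{8}>\]\s+([A-Za-z_][\w.]*)\+0x")


def parse_tasks(dump):
    """Split a dump into tasks; works on flattened or line-per-entry text."""
    headers = list(HEADER.finditer(dump))
    tasks = []
    for i, m in enumerate(headers):
        # A task's body runs until the next header (or the end of the dump).
        end = headers[i + 1].start() if i + 1 < len(headers) else len(dump)
        trace = FRAME.findall(dump[m.end():end])
        tasks.append(Task(m["name"], m["state"], int(m["pid"]),
                          int(m["ppid"]), trace))
    return tasks


def blocked_on(tasks, symbol):
    """Tasks in uninterruptible sleep (state D) with `symbol` in their trace."""
    return [t for t in tasks if t.state == "D" and symbol in t.trace]


# Shortened excerpt of the dump above, used only as sample input.
SAMPLE = (
    "sshd D 00000000 0 3759 3756 3760 (NOTLB) d49bfd18 00000086 "
    "Call Trace: [<c0129642>] schedule_timeout+0x72/0xd0 "
    "[<c020e8ab>] blk_congestion_wait+0x7b/0x90 "
    "[<c014205b>] __alloc_pages+0x29b/0x330 "
    "bash S 00000246 0 3760 3759 (NOTLB) e7045e58 00000086 "
    "Call Trace: [<c0129693>] schedule_timeout+0xc3/0xd0 "
    "[<c01f44f6>] tty_read+0x176/0x1b0"
)

if __name__ == "__main__":
    for task in blocked_on(parse_tasks(SAMPLE), "blk_congestion_wait"):
        print(task.name, task.pid, "->", " <- ".join(task.trace))
```

Run against a saved copy of the full dump, the same two functions would list every task stuck in state D behind a given symbol, which makes it easier to compare lockups across boxes.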
* Re: 2.6.X kernel memory leak?
2004-04-09 7:17 ` 2.6.X kernel memory leak? Sergey S. Kostyliov
@ 2004-04-09 9:09 ` Andrew Morton
2004-04-09 12:15 ` Sergey S. Kostyliov
0 siblings, 1 reply; 24+ messages in thread
From: Andrew Morton @ 2004-04-09 9:09 UTC (permalink / raw)
To: Sergey S. Kostyliov; +Cc: linux-kernel, anton
"Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
>
> And here is part of sysrq-T for the third machine, which have just locked up,
> kernel is 2.6.5-rc3-aa2.
It does look like a kernel memory leak, but it's not into slab.
You've disabled iptables. Possibly there's a leak in a device driver?
Which drivers are in regular use there? What are you using for those
hardware RAID controllers?
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: 2.6.X kernel memory leak?
2004-04-09 9:09 ` Andrew Morton
@ 2004-04-09 12:15 ` Sergey S. Kostyliov
0 siblings, 0 replies; 24+ messages in thread
From: Sergey S. Kostyliov @ 2004-04-09 12:15 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-kernel, anton
Hello Andrew,
On Friday 09 April 2004 13:09, Andrew Morton wrote:
> "Sergey S. Kostyliov" <rathamahata@php4.ru> wrote:
> >
> > And here is part of sysrq-T for the third machine, which have just locked up,
> > kernel is 2.6.5-rc3-aa2.
>
> It does look like a kernel memory leak, but it's not into slab.
>
> You've disabled iptables. Possibly there's a leak in a device driver?
> Which drivers are in regular use there? What are you using for those
> hardware RAID controllers?
I've seen this kind of lockup (according to sysrq-T) on different boxes:
1) ope
RAID: mylex 352
drivers: e100, dac960
.config: http://sysadminday.org.ru/2.6.1-io_lockup/ope/.config
2) terror
RAID: megaraid 320-2
drivers: e1000, megaraid2
.config: http://sysadminday.org.ru/2.6.X-lockup/terror/.config
3) mirror
drivers: e100, aic7xxx, md, netconsole
.config: http://sysadminday.org.ru/2.6.X-lockup/mirror/.config
I also saw the same symptoms on a fourth box, but I'm not sure about this one
because it was not attached to a serial console at that time. For this box:
RAID: Compaq smart 2
drivers: tlan, epic100, cpqarray
--
Best regards,
Sergey S. Kostyliov <rathamahata@php4.ru>
Public PGP key: http://sysadminday.org.ru/rathamahata.asc
^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2004-04-09 12:19 UTC | newest]
Thread overview: 24+ messages
2004-01-31 16:40 2.6.1 IO lockup on SMP systems Sergey S. Kostyliov
2004-02-01 0:17 ` Andrew Morton
2004-02-21 16:45 ` Sergey S. Kostyliov
2004-02-21 19:30 ` Andrew Morton
2004-02-22 17:39 ` Alexander Y. Fomichev
2004-02-23 17:27 ` Sergey S. Kostyliov
2004-02-23 21:30 ` Mike Fedyk
2004-02-24 11:56 ` Sergey S. Kostyliov
2004-02-23 22:26 ` Andrew Morton
2004-02-24 7:23 ` Marcelo Tosatti
2004-02-24 6:53 ` Andrew Morton
2004-02-24 11:54 ` Sergey S. Kostyliov
2004-02-26 12:19 ` Sergey S. Kostyliov
2004-02-26 12:53 ` Andrew Morton
2004-02-26 13:11 ` Andrew Morton
2004-02-26 14:37 ` Dave Jones
2004-02-26 15:37 ` Arjan van de Ven
2004-02-26 14:30 ` Sergey S. Kostyliov
2004-02-26 20:03 ` Andrew Morton
2004-02-28 14:56 ` Sergey S. Kostyliov
2004-04-08 9:08 ` 2.6.X kernel memory leak? (was: Re: 2.6.1 IO lockup on SMP systems) Sergey S. Kostyliov
2004-04-09 7:17 ` 2.6.X kernel memory leak? Sergey S. Kostyliov
2004-04-09 9:09 ` Andrew Morton
2004-04-09 12:15 ` Sergey S. Kostyliov