* Re: Very aggressive memory reclaim

From: Dave Chinner @ 2011-03-28 21:53 UTC
To: John Lepikhin; +Cc: linux-kernel, xfs, linux-mm

[cc xfs and mm lists]

On Mon, Mar 28, 2011 at 08:39:29PM +0400, John Lepikhin wrote:
> Hello,
>
> I use a highly loaded machine with 10M+ inodes inside XFS, 50+ GB of
> memory, intensive HDD traffic and 20..50 forks per second, running
> vanilla kernel 2.6.37.4. The problem is that the kernel frees memory
> very aggressively.
>
> For example:
>
> 25% of memory is used by processes
> 50% for page caches
> 7% for slabs, etc.
> 18% free.
>
> That's bad, but it works. After a few hours:
>
> 25% of memory is used by processes
> 62% for page caches
> 7% for slabs, etc.
> 5% free.
>
> Most files are cached; it works perfectly. This is the moment when
> the kernel decides to free some memory. After memory reclaim:
>
> 25% of memory is used by processes
> 25% for page caches(!)
> 7% for slabs, etc.
> 43% free(!)
>
> The page cache is dropped and the server becomes too slow. This is
> the beginning of a new cycle.
>
> I didn't find any huge mallocs at that moment. It looks like, because
> of the large number of small mallocs (forks), the kernel makes a
> pessimistic forecast about future memory usage and frees too much
> memory. Are there any options for tuning this? Any other ideas?

First it would be useful to determine why the VM is reclaiming so
much memory. If it is somewhat predictable when the excessive
reclaim is going to happen, it might be worth capturing an event
trace from the VM so we can see more precisely what it is doing
during this event. In that case, recording the kmem/* and vmscan/*
events is probably sufficient to tell us what memory allocations
triggered reclaim and how much reclaim was done on each event.

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body
to majordomo@kvack.org. For more info on Linux MM, see:
http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
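[Editor's note: Dave's suggestion can be followed without extra tooling via the raw ftrace interface. Below is a minimal sketch, not from the thread itself; it assumes debugfs is mounted at /sys/kernel/debug, root privileges, and a kernel with the kmem/vmscan tracepoints. The helper name and the 60-second default window are arbitrary; the trace directory is a parameter so the helper can be exercised against a dummy tree.]

```shell
#!/bin/sh
# Sketch: enable every kmem/* and vmscan/* tracepoint for a fixed window
# and dump the ftrace ring buffer.
#   capture_reclaim_trace TRACEDIR [SECONDS]
# TRACEDIR is normally /sys/kernel/debug/tracing.
capture_reclaim_trace() {
    t=$1
    secs=${2:-60}
    echo 1 > "$t/events/kmem/enable"      # all kmem/* events
    echo 1 > "$t/events/vmscan/enable"    # all vmscan/* events
    echo 1 > "$t/tracing_on"
    sleep "$secs"                         # window covering the reclaim event
    echo 0 > "$t/tracing_on"
    cat "$t/trace"                        # dump the ring buffer
}

# Example (as root, around the time reclaim is expected to fire):
#   capture_reclaim_trace /sys/kernel/debug/tracing 60 > /tmp/reclaim-trace.txt
```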
* Re: Very aggressive memory reclaim

From: Minchan Kim @ 2011-03-28 22:52 UTC
To: Dave Chinner; +Cc: John Lepikhin, linux-kernel, xfs, linux-mm

On Tue, Mar 29, 2011 at 6:53 AM, Dave Chinner <david@fromorbit.com> wrote:
> [cc xfs and mm lists]
>
> On Mon, Mar 28, 2011 at 08:39:29PM +0400, John Lepikhin wrote:
>> [...]
>
> First it would be useful to determine why the VM is reclaiming so
> much memory. If it is somewhat predictable when the excessive
> reclaim is going to happen, it might be worth capturing an event
> trace from the VM so we can see more precisely what it is doing
> during this event. In that case, recording the kmem/* and vmscan/*
> events is probably sufficient to tell us what memory allocations
> triggered reclaim and how much reclaim was done on each event.
>
> Cheers,
>
> Dave.

Recently, we had a similar issue:
http://www.spinics.net/lists/linux-mm/msg12243.html
But it seems the patch was not merged. I don't know why, since I
didn't follow up on the thread. Maybe the Cc'ed guys can help you.

Is it a sudden, big cache drop at one moment, or small cache drops
accumulated over a long time? What are your zones' sizes?

Please attach the output of `cat /proc/zoneinfo` for the others.
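[Editor's note: since the interesting numbers only exist at crisis time, the data Minchan asks for is easiest to capture with a periodic snapshot. A hedged sketch follows; the log path and one-minute cadence are arbitrary choices, not from the thread.]

```shell
#!/bin/sh
# Sketch: append a timestamped /proc/zoneinfo + /proc/meminfo snapshot
# to a log file, so the state around the reclaim event is preserved
# even if nobody is watching the machine.
snapshot_mm() {
    out=$1
    {
        printf '=== %s ===\n' "$(date -u '+%Y-%m-%d %H:%M:%S')"
        cat /proc/zoneinfo /proc/meminfo
    } >> "$out"
}

# Example: run once a minute from a loop (or a cron job):
#   while :; do snapshot_mm /var/log/mm-snapshots.log; sleep 60; done
```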
* Re: Very aggressive memory reclaim

From: KOSAKI Motohiro @ 2011-03-29 2:55 UTC
To: Minchan Kim; +Cc: kosaki.motohiro, Dave Chinner, John Lepikhin, linux-kernel, xfs, linux-mm

> Recently, we had a similar issue:
> http://www.spinics.net/lists/linux-mm/msg12243.html
> But it seems the patch was not merged. I don't know why, since I
> didn't follow up on the thread. Maybe the Cc'ed guys can help you.
>
> Is it a sudden, big cache drop at one moment, or small cache drops
> accumulated over a long time? What are your zones' sizes?
>
> Please attach the output of `cat /proc/zoneinfo` for the others.

If I remember correctly, 2.6.38 includes Mel's anti-aggressive-reclaim
patches, and the original report seems to be using 2.6.37.x.

John, can you try 2.6.38?
* Re: Very aggressive memory reclaim

From: John Lepikhin @ 2011-03-29 7:33 UTC
To: KOSAKI Motohiro; +Cc: Minchan Kim, Dave Chinner, linux-kernel, xfs, linux-mm

2011/3/29 KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>:
> If I remember correctly, 2.6.38 includes Mel's anti-aggressive-reclaim
> patches, and the original report seems to be using 2.6.37.x.
>
> John, can you try 2.6.38?

I'll ask my boss about it. Unfortunately, we found the opposite issue
with memory management + XFS (100M inodes) on 2.6.38: some objects in
the xfs_inode and dentry slabs seem never to be freed (at least not
without "sync && echo 2 > .../drop_caches"). But that is not a
production machine working 24x7, so we don't care about it right now.
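[Editor's note: the slab growth John describes can be tracked over time from /proc/slabinfo. A small sketch follows; the column positions assume the 2.6-era slabinfo 2.x layout (name, active_objs, num_objs, objsize, ...), and reading /proc/slabinfo normally requires root.]

```shell
#!/bin/sh
# Sketch: report active/total object counts for the two slabs in
# question, to see whether they ever shrink without drop_caches.
slab_counts() {
    # $1 = a slabinfo-format file (normally /proc/slabinfo)
    awk '$1 == "xfs_inode" || $1 == "dentry" { print $1, $2, $3 }' "$1"
}

# Example: slab_counts /proc/slabinfo
```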
* Re: Very aggressive memory reclaim

From: John Lepikhin @ 2011-03-29 7:22 UTC
To: Minchan Kim; +Cc: Dave Chinner, linux-kernel, xfs, linux-mm

2011/3/29 Minchan Kim <minchan.kim@gmail.com>:
> Please attach the output of `cat /proc/zoneinfo` for the others.

See attachment. Right now I have no zoneinfo from crisis time, but I
can catch it if required.

[-- Attachment: zoneinfo --]

Node 0, zone DMA
  pages free 3968, min 0, low 0, high 0, scanned 0, spanned 4080, present 3920
  (all nr_* and numa_* counters are 0)
  protection: (0, 2173, 48380, 48380)
  pagesets: cpus 0-23 identical: count 0, high 0, batch 1, vm stats threshold 10
  all_unreclaimable: 0
  start_pfn: 16
  inactive_ratio: 1

Node 0, zone DMA32
  pages free 364761, min 16, low 20, high 24, scanned 0, spanned 1044480, present 556409
  nr_free_pages 364761
  nr_inactive_anon 21990
  nr_active_anon 7663
  nr_inactive_file 73
  nr_active_file 2300
  nr_unevictable 0
  nr_mlock 0
  nr_anon_pages 28933
  nr_mapped 419
  nr_file_pages 3093
  nr_dirty 2
  nr_writeback 0
  nr_slab_reclaimable 114260
  nr_slab_unreclaimable 17480
  nr_page_table_pages 52
  nr_kernel_stack 4
  nr_unstable 0
  nr_bounce 0
  nr_vmscan_write 1394045
  nr_writeback_temp 0
  nr_isolated_anon 0
  nr_isolated_file 0
  nr_shmem 720
  nr_dirtied 2710373
  nr_written 3892495
  numa_hit 457923559
  numa_miss 142009831
  numa_foreign 0
  numa_interleave 0
  numa_local 457586157
  numa_other 142347233
  protection: (0, 0, 46207, 46207)
  pagesets: cpus 0-23 identical: count 0, high 186, batch 31, vm stats threshold 60
  all_unreclaimable: 0
  start_pfn: 4096
  inactive_ratio: 4

Node 0, zone Normal
  pages free 1647180, min 357, low 446, high 535, scanned 0, spanned 11993088, present 11829120
  nr_free_pages 1647180
  nr_inactive_anon 450069
  nr_active_anon 3433771
  nr_inactive_file 2955991
  nr_active_file 2119000
  nr_unevictable 0
  nr_mlock 0
  nr_anon_pages 3638671
  nr_mapped 115635
  nr_file_pages 5320039
  nr_dirty 75137
  nr_writeback 2
  nr_slab_reclaimable 786231
  nr_slab_unreclaimable 97963
  nr_page_table_pages 67845
  nr_kernel_stack 1361
  nr_unstable 0
  nr_bounce 0
  nr_vmscan_write 15379957
  nr_writeback_temp 0
  nr_isolated_anon 0
  nr_isolated_file 0
  nr_shmem 245041
  nr_dirtied 211669545
  nr_written 202526281
  numa_hit 41186823085
  numa_miss 235241135
  numa_foreign 690066765
  numa_interleave 27671
  numa_local 41186809452
  numa_other 235254768
  protection: (0, 0, 0, 0)
  pagesets: all cpus high 186, batch 31, vm stats threshold 100;
    per-cpu counts (cpu 0..23): 153 156 160 177 157 165 26 161 137 169
    27 164 177 180 180 183 98 170 92 157 162 177 157 171
  all_unreclaimable: 0
  start_pfn: 1048576
  inactive_ratio: 21

Node 1, zone Normal
  pages free 2790210, min 375, low 468, high 562, scanned 0, spanned 12582912, present 12410880
  nr_free_pages 2790210
  nr_inactive_anon 551309
  nr_active_anon 3043861
  nr_inactive_file 2311025
  nr_active_file 2310059
  nr_unevictable 0
  nr_mlock 0
  nr_anon_pages 3297378
  nr_mapped 141699
  nr_file_pages 4918896
  nr_dirty 51559
  nr_writeback 0
  nr_slab_reclaimable 862638
  nr_slab_unreclaimable 123830
  nr_page_table_pages 145273
  nr_kernel_stack 1497
  nr_unstable 0
  nr_bounce 0
  nr_vmscan_write 10156465
  nr_writeback_temp 0
  nr_isolated_anon 0
  nr_isolated_file 0
  nr_shmem 297798
  nr_dirtied 216473986
  nr_written 191524308
  numa_hit 43711879913
  numa_miss 690066765
  numa_foreign 377250966
  numa_interleave 27810
  numa_local 43711847977
  numa_other 690098701
  protection: (0, 0, 0, 0)
  pagesets: all cpus high 186, batch 31, vm stats threshold 100;
    per-cpu counts (cpu 0..23): 159 71 175 180 181 74 170 159 176 161
    180 5 184 122 168 123 155 37 177 44 185 28 172 190
  all_unreclaimable: 0
  start_pfn: 13041664
  inactive_ratio: 21

[-- Attachment: meminfo --]

MemTotal:       99149428 kB
MemFree:        19224476 kB
Buffers:               0 kB
Cached:         40968112 kB
SwapCached:            0 kB
Active:         43666616 kB
Inactive:       25161828 kB
Active(anon):   25941180 kB
Inactive(anon):  4093472 kB
Active(file):   17725436 kB
Inactive(file): 21068356 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:            506792 kB
Writeback:             0 kB
AnonPages:      27859928 kB
Mapped:          1031012 kB
Shmem:           2174236 kB
Slab:            8009608 kB
SReclaimable:    7052516 kB
SUnreclaim:       957092 kB
KernelStack:       22896 kB
PageTables:       852680 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    49574712 kB
Committed_AS:  301362748 kB
VmallocTotal:  34359738367 kB
VmallocUsed:      635548 kB
VmallocChunk:  34258959360 kB
DirectMap4k:        1284 kB
DirectMap2M:     3084288 kB
DirectMap1G:    97517568 kB
* Re: Very aggressive memory reclaim

From: Andi Kleen @ 2011-03-28 23:58 UTC
To: Dave Chinner; +Cc: John Lepikhin, linux-kernel, xfs, linux-mm

Dave Chinner <david@fromorbit.com> writes:
>
> First it would be useful to determine why the VM is reclaiming so
> much memory. If it is somewhat predictable when the excessive
> reclaim is going to happen, it might be worth capturing an event

Often it's to get pages of a higher order. Just tracing alloc_pages
should tell you that.

There are a few other cases (like memory failure handling), but
they're more obscure.

-Andi
--
ak@linux.intel.com -- Speaking for myself only
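[Editor's note: Andi's theory is easy to check once an ftrace dump of the mm_page_alloc events exists, since each event line carries an `order=N` field. A hedged sketch of a histogram over those fields follows; the text-matching is a rough heuristic, not an official tool.]

```shell
#!/bin/sh
# Sketch: histogram of allocation orders seen in an ftrace dump.
# A large count for order > 0 would support the idea that higher-order
# allocations are driving the reclaim.
count_orders() {
    # $1 = a text file containing ftrace output with "order=N" fields
    grep -o 'order=[0-9]*' "$1" | sort | uniq -c | sort -rn
}

# Example: count_orders /tmp/reclaim-trace.txt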
* Re: Very aggressive memory reclaim

From: Dave Chinner @ 2011-03-29 1:57 UTC
To: Andi Kleen; +Cc: John Lepikhin, linux-kernel, xfs, linux-mm

On Mon, Mar 28, 2011 at 04:58:50PM -0700, Andi Kleen wrote:
> Dave Chinner <david@fromorbit.com> writes:
> >
> > First it would be useful to determine why the VM is reclaiming so
> > much memory. If it is somewhat predictable when the excessive
> > reclaim is going to happen, it might be worth capturing an event
>
> Often it's to get pages of a higher order. Just tracing alloc_pages
> should tell you that.

Yes, the kmem/mm_page_alloc tracepoint gives us that. But in case
that is not the cause, grabbing all the tracepoints I suggested is
more likely to indicate where the problem is. I'd prefer to get more
data than needed the first time around, rather than have to do
multiple round trips because a single tracepoint doesn't tell us the
cause...

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Very aggressive memory reclaim

From: John Lepikhin @ 2011-03-29 7:26 UTC
To: Dave Chinner; +Cc: linux-kernel, xfs, linux-mm

2011/3/29 Dave Chinner <david@fromorbit.com>:
> First it would be useful to determine why the VM is reclaiming so
> much memory. [...] In that case, recording the kmem/* and vmscan/*
> events is probably sufficient to tell us what memory allocations
> triggered reclaim and how much reclaim was done on each event.

Do you mean I should add some debug code to the mm functions? I don't
know any other way to catch such events.
* Re: Very aggressive memory reclaim

From: Avi Kivity @ 2011-03-29 8:59 UTC
To: John Lepikhin; +Cc: Dave Chinner, linux-kernel, xfs, linux-mm

On 03/29/2011 09:26 AM, John Lepikhin wrote:
> Do you mean I should add some debug code to the mm functions? I
> don't know any other way to catch such events.

Download and build trace-cmd
(git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git),
and do

  $ trace-cmd record -e kmem -e vmscan -b 30000

Hit ctrl-C when done and post the output file generated in cwd.

--
error compiling committee.c: too many arguments to function
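[Editor's note: once a trace.dat exists, a first-pass summary can be made from the text that `trace-cmd report` prints. The line shape assumed below (task-pid, [cpu], timestamp:, event-name: fields) matches typical trace-cmd report output, but the awk is only a heuristic sketch, not a supported interface.]

```shell
#!/bin/sh
# Sketch: count events by name in "trace-cmd report"-style output, to
# see which allocation/reclaim events dominated the captured window.
summarize_events() {
    awk '{
        # the event name is the first colon-terminated field that is
        # not the numeric timestamp
        for (i = 2; i <= NF; i++)
            if ($i ~ /:$/ && $i !~ /^[0-9]/) {
                sub(/:$/, "", $i); ev[$i]++; break
            }
    } END { for (e in ev) print ev[e], e }' "$1" | sort -rn
}

# Example: trace-cmd report > report.txt && summarize_events report.txt
```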
* Re: Very aggressive memory reclaim

From: Wu Fengguang @ 2011-03-30 13:48 UTC
To: John Lepikhin; +Cc: Jeffrey Hundstad, linux-kernel@vger.kernel.org, Alexander Viro, linux-fsdevel, Linux Memory Management List

Hi John,

On Mon, Mar 28, 2011 at 10:50:56PM +0400, John Lepikhin wrote:
> 2011/3/28 Jeffrey Hundstad <jeffrey.hundstad@mnsu.edu>:
>
> > I'd take a look here:
> > http://www.linuxinsight.com/proc_sys_vm_hierarchy.html
>
> Yes, I have already played with dirty_*, min_free_kbytes (3000 kB),
> swappiness (0..100), vfs_cache_pressure (1..200) and
> zone_reclaim_mode (currently 0). The other parameters are set to
> their defaults.
>
> By the way, there is no swap enabled. When there was a swap device,
> the kernel swapped intensively instead of just dropping 50% of the
> page cache.

Is your memory usage balanced across the nodes? You can check it via
/sys/devices/system/node/node*/meminfo.

Are there lots of high-order memory allocations? /proc/buddyinfo will
disclose some of them.

Thanks,
Fengguang
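[Editor's note: a sketch of the second check Fengguang suggests. Each /proc/buddyinfo line lists free-block counts per order for one zone; the 11 per-order columns assumed below match the usual x86 layout with MAX_ORDER = 11. Near-zero sums for order >= 3 would indicate the fragmentation that makes higher-order allocations trigger reclaim.]

```shell
#!/bin/sh
# Sketch: per zone, print the node, zone name, and total number of free
# blocks of order >= 3 (i.e. contiguous runs of 8+ pages).
high_order_free() {
    # $1 = a buddyinfo-format file (normally /proc/buddyinfo); lines
    # look like: "Node 0, zone Normal c0 c1 ... c10" where cN is the
    # free-block count for order N, so order 3 is field 8.
    awk '{
        gsub(/,/, "", $2)
        s = 0
        for (i = 8; i <= NF; i++) s += $i
        print "node " $2 " zone " $4 ": " s
    }' "$1"
}

# Example: high_order_free /proc/buddyinfo
```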
Thread overview (10+ messages, newest ~2011-03-30 13:48 UTC):

  [not found] <AANLkTinFqqmE+fTMTLVU-_CwPE+LQv7CpXSQ5+CdAKLK@mail.gmail.com>
  2011-03-28 21:53 ` Very aggressive memory reclaim (Dave Chinner)
  2011-03-28 22:52   ` Minchan Kim
  2011-03-29  2:55     ` KOSAKI Motohiro
  2011-03-29  7:33       ` John Lepikhin
  2011-03-29  7:22     ` John Lepikhin
  2011-03-28 23:58   ` Andi Kleen
  2011-03-29  1:57     ` Dave Chinner
  2011-03-29  7:26   ` John Lepikhin
  2011-03-29  8:59     ` Avi Kivity
  2011-03-30 13:48 ` Wu Fengguang (via [not found] <4D90C071.7040205@mnsu.edu>)