* Re: tracking memory usage/leak in "inactive" field in /proc/meminfo?
@ 2010-02-11 18:54 ` Chris Friesen
0 siblings, 0 replies; 41+ messages in thread
From: Chris Friesen @ 2010-02-11 18:54 UTC (permalink / raw)
To: Minchan Kim
Cc: KOSAKI Motohiro, Rik van Riel, Linux Kernel Mailing List,
linux-mm, Balbir Singh
On 02/10/2010 06:45 PM, Minchan Kim wrote:
> On Thu, Feb 11, 2010 at 2:05 AM, Chris Friesen <cfriesen@nortel.com> wrote:
>> In those spreadsheets I notice that
>> memfree+active+inactive+slab+pagetables is basically a constant.
>> However, if I don't use active+inactive then I can't make the numbers
>> add up. And the difference between active+inactive and
>> buffers+cached+anonpages+dirty+mapped+pagetables+vmallocused grows
>> almost monotonically.
>
> Such comparison is not right. That's because code pages of program account
> with cached and mapped but they account just one in lru list(active +
> inactive).
> Also, if you use mmap on any file, above is applied.
That just makes the comparison even worse...it means that there is more
memory in active/inactive that isn't accounted for in any other category
in /proc/meminfo.
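For anyone wanting to reproduce the bookkeeping being argued about here, the following is a minimal Python sketch, not the spreadsheet actually used in this thread. It parses /proc/meminfo-style text and computes the two sums mentioned above. The function names and the sample numbers are invented for illustration; only the field names come from /proc/meminfo. As noted in the quoted reply, file-backed mapped pages are counted under both Cached and Mapped, so the second sum over-counts and the real gap would be larger than the figure this prints.

```python
# Sketch only: the helper names and sample values below are hypothetical.
SAMPLE_MEMINFO = """\
MemFree:        1000 kB
Buffers:          50 kB
Cached:          400 kB
Active:          500 kB
Inactive:        300 kB
AnonPages:       200 kB
Dirty:             5 kB
Mapped:           60 kB
Slab:            100 kB
PageTables:       20 kB
VmallocUsed:      10 kB
"""


def parse_meminfo(text):
    """Turn /proc/meminfo-style text into a dict of field name -> value in kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[name.strip()] = int(parts[0])
    return fields


def conserved_total(fields):
    """memfree + active + inactive + slab + pagetables (reported as roughly constant)."""
    names = ("MemFree", "Active", "Inactive", "Slab", "PageTables")
    return sum(fields.get(n, 0) for n in names)


def lru_gap(fields):
    """(active + inactive) minus the other categories listed in the thread."""
    lru = fields.get("Active", 0) + fields.get("Inactive", 0)
    names = ("Buffers", "Cached", "AnonPages", "Dirty",
             "Mapped", "PageTables", "VmallocUsed")
    return lru - sum(fields.get(n, 0) for n in names)


fields = parse_meminfo(SAMPLE_MEMINFO)
print("conserved total (kB):", conserved_total(fields))
print("LRU gap (kB):", lru_gap(fields))
```

On a live system the same functions could be fed the contents of /proc/meminfo at intervals, logging `lru_gap` to see whether it grows over time.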
> I can't find any clue with your attachment.
> You said you used kernel with some modification and non-vanilla drivers.
> So I suspect that. Maybe kernel memory leak?
Possibly. Or it could be a use-case issue; I know there have been
memory leaks fixed since 2.6.27. :)
> Now kernel don't account kernel memory allocations except SLAB.
I don't think that's entirely accurate. I think cached, buffers,
pagetables, vmallocUsed are all kernel allocations. Granted, they're
generally on behalf of userspace.
I've discovered that the generic page allocator (alloc_page, etc.) is
not tracked at all in /proc/meminfo. I seem to see the memory increase
in the page cache (that is, active/inactive), so that would seem to rule
out most direct allocations.
> I think this patch can help you find the kernel memory leak.
> (It isn't merged with mainline by somewhy but it is useful to you :)
>
> http://marc.info/?l=linux-mm&m=123782029809850&w=2
I have a modified version of that which I picked up as part of the
kmemleak backport. However, it doesn't help unless I can narrow down
*which* pages I should care about.
I tried using kmemleak directly, but it didn't find anything. I've also
tried checking for inactive pages which haven't been written to in 10
minutes, and haven't had much luck there either. But active/inactive
keeps growing, and I don't know why.
Chris
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: tracking memory usage/leak in "inactive" field in /proc/meminfo?
2010-02-11 18:54 ` Chris Friesen
@ 2010-02-11 19:04 ` Rik van Riel
-1 siblings, 0 replies; 41+ messages in thread
From: Rik van Riel @ 2010-02-11 19:04 UTC (permalink / raw)
To: Chris Friesen
Cc: Minchan Kim, KOSAKI Motohiro, Linux Kernel Mailing List, linux-mm,
Balbir Singh
On 02/11/2010 01:54 PM, Chris Friesen wrote:
> On 02/10/2010 06:45 PM, Minchan Kim wrote:
>> On Thu, Feb 11, 2010 at 2:05 AM, Chris Friesen<cfriesen@nortel.com> wrote:
>
>>> In those spreadsheets I notice that
>>> memfree+active+inactive+slab+pagetables is basically a constant.
>>> However, if I don't use active+inactive then I can't make the numbers
>>> add up. And the difference between active+inactive and
>>> buffers+cached+anonpages+dirty+mapped+pagetables+vmallocused grows
>>> almost monotonically.
>>
>> Such comparison is not right. That's because code pages of program account
>> with cached and mapped but they account just one in lru list(active +
>> inactive).
>> Also, if you use mmap on any file, above is applied.
>
> That just makes the comparison even worse...it means that there is more
> memory in active/inactive that isn't accounted for in any other category
> in /proc/meminfo.
Which does not happen in the standard 2.6.27 kernel.
Are you leaking memory in your driver?
>
>> I can't find any clue with your attachment.
>> You said you used kernel with some modification and non-vanilla drivers.
>> So I suspect that. Maybe kernel memory leak?
>
> Possibly. Or it could be a use case issue, I know there have been
> memory leaks fixed since 2.6.27. :)
>
>> Now kernel don't account kernel memory allocations except SLAB.
>
> I don't think that's entirely accurate. I think cached, buffers,
> pagetables, vmallocUsed are all kernel allocations. Granted, they're
> generally on behalf of userspace.
>
> I've discovered that the generic page allocator (alloc_page, etc.) is
> not tracked at all in /proc/meminfo. I seem to see the memory increase
> in the page cache (that is, active/inactive), so that would seem to rule
> out most direct allocations.
>
>> I think this patch can help you find the kernel memory leak.
>> (It isn't merged with mainline by somewhy but it is useful to you :)
>>
>> http://marc.info/?l=linux-mm&m=123782029809850&w=2
>
> I have a modified version of that which I picked up as part of the
> kmemleak backport. However, it doesn't help unless I can narrow down
> *which* pages I should care about.
>
> I tried using kmemleak directly, but it didn't find anything. I've also
> tried checking for inactive pages which haven't been written to in 10
> minutes, and haven't had much luck there either. But active/inactive
> keeps growing, and I don't know why.
>
> Chris
--
All rights reversed.
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: tracking memory usage/leak in "inactive" field in /proc/meminfo?
2010-02-11 18:54 ` Chris Friesen
@ 2010-02-12 2:38 ` Minchan Kim
-1 siblings, 0 replies; 41+ messages in thread
From: Minchan Kim @ 2010-02-12 2:38 UTC (permalink / raw)
To: Chris Friesen
Cc: KOSAKI Motohiro, Rik van Riel, Linux Kernel Mailing List,
linux-mm, Balbir Singh
On Fri, Feb 12, 2010 at 3:54 AM, Chris Friesen <cfriesen@nortel.com> wrote:
> That just makes the comparison even worse...it means that there is more
> memory in active/inactive that isn't accounted for in any other category
> in /proc/meminfo.
Hmm, that is very strange. It should be impossible if your kernel and drivers are behaving normally.
Could you grep the sources for code that increments NR_ACTIVE/INACTIVE?
I suspect one of your drivers increments the counter and misses the decrement.
>> Now kernel don't account kernel memory allocations except SLAB.
>
> I don't think that's entirely accurate. I think cached, buffers,
> pagetables, vmallocUsed are all kernel allocations. Granted, they're
> generally on behalf of userspace.
Yes, I was simplifying. What I mean is that the kernel doesn't account for all of
its memory usage. :)
> I have a modified version of that which I picked up as part of the
> kmemleak backport. However, it doesn't help unless I can narrow down
> *which* pages I should care about.
kmemleak doesn't track the page allocator or ioremap.
The patch at the URL above can only tell you who requested each page that is
currently in use (i.e., not free).
> I tried using kmemleak directly, but it didn't find anything. I've also
> tried checking for inactive pages which haven't been written to in 10
> minutes, and haven't had much luck there either. But active/inactive
> keeps growing, and I don't know why.
If the leak is caused by alloc_page or __get_free_pages, kmemleak can't find it.
>
> Chris
>
--
Kind regards,
Minchan Kim
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: tracking memory usage/leak in "inactive" field in /proc/meminfo?
2010-02-12 2:38 ` Minchan Kim
@ 2010-02-12 7:35 ` Chris Friesen
-1 siblings, 0 replies; 41+ messages in thread
From: Chris Friesen @ 2010-02-12 7:35 UTC (permalink / raw)
To: Minchan Kim
Cc: KOSAKI Motohiro, Rik van Riel, Linux Kernel Mailing List,
linux-mm, Balbir Singh
On 02/11/2010 08:38 PM, Minchan Kim wrote:
> On Fri, Feb 12, 2010 at 3:54 AM, Chris Friesen <cfriesen@nortel.com> wrote:
>> That just makes the comparison even worse...it means that there is more
>> memory in active/inactive that isn't accounted for in any other category
>> in /proc/meminfo.
>
> Hmm. It's very strange. It's impossible if your kernel and drivers is normal.
> Could you grep sources who increases NR_ACTIVE/INACTIVE?
> I doubt one of your driver does increase and miss decrease.
I instrumented the page cache to track all additions/subtractions of
pages to/from the LRU. I also added some page flags to track pages
counting towards NR_FILE_PAGES and NR_ANON_PAGES. I then periodically
scanned all of the pages on the LRU and if they weren't part of
NR_FILE_PAGES or NR_ANON_PAGES I dumped the call chain of the code that
added the page to the LRU.
After being up about 2.5 hrs, there were 4265 pages in the LRU that
weren't part of file or anon. These broke down into two separate call
chains (there were actually three separate offsets within
compat_do_execve, but the rest was identical):
backtrace:
[<ffffffff8061c162>] kmemleak_alloc_page+0x1eb/0x380
[<ffffffff80276ae8>] __pagevec_lru_add_active+0xb6/0x104
[<ffffffff80276b85>] lru_cache_add_active+0x4f/0x53
[<ffffffff8027d182>] do_wp_page+0x355/0x6f6
[<ffffffff8027eef1>] handle_mm_fault+0x62b/0x77c
[<ffffffff80632557>] do_page_fault+0x3c7/0xba0
[<ffffffff8062fb79>] error_exit+0x0/0x51
[<ffffffffffffffff>] 0xffffffffffffffff
and
backtrace:
[<ffffffff8061c162>] kmemleak_alloc_page+0x1eb/0x380
[<ffffffff80276ae8>] __pagevec_lru_add_active+0xb6/0x104
[<ffffffff80276b85>] lru_cache_add_active+0x4f/0x53
[<ffffffff8027eddc>] handle_mm_fault+0x516/0x77c
[<ffffffff8027f180>] get_user_pages+0x13e/0x462
[<ffffffff802a2f65>] get_arg_page+0x6a/0xca
[<ffffffff802a30bf>] copy_strings+0xfa/0x1d4
[<ffffffff802a31c7>] copy_strings_kernel+0x2e/0x43
[<ffffffff802d33fb>] compat_do_execve+0x1fa/0x2fd
[<ffffffff8021e405>] sys32_execve+0x44/0x62
[<ffffffff8021def5>] ia32_ptregs_common+0x25/0x50
[<ffffffffffffffff>] 0xffffffffffffffff
I'll dig into them further, but do either of these look like known issues?
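Collapsing dumps that differ only in their offsets, as described above for compat_do_execve, can be done mechanically. The following Python sketch is one possible way to do it and is not the instrumentation used here: it strips the addresses and the +offset/size suffixes from each frame and counts identical call chains. The function names are made up for this example, and the second sample trace uses a hypothetical offset.

```python
import re
from collections import Counter

# Matches frames such as "[<ffffffff8027d182>] do_wp_page+0x355/0x6f6".
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\S+?)(?:\+0x[0-9a-f]+/0x[0-9a-f]+)?\s*$"
)


def call_chain(backtrace_text):
    """Return the function names of one backtrace, ignoring addresses and offsets."""
    chain = []
    for line in backtrace_text.splitlines():
        match = FRAME_RE.search(line.strip())
        if not match:
            continue  # e.g. the "backtrace:" header line
        name = match.group(1)
        if name.startswith("0x"):
            continue  # terminator frame such as 0xffffffffffffffff
        chain.append(name)
    return tuple(chain)


def group_backtraces(traces):
    """Count how many backtraces share each offset-free call chain."""
    return Counter(call_chain(t) for t in traces)


# Shortened sample traces; the offset in TRACE_B is hypothetical.
TRACE_A = """backtrace:
[<ffffffff80276b85>] lru_cache_add_active+0x4f/0x53
[<ffffffff802d33fb>] compat_do_execve+0x1fa/0x2fd
[<ffffffffffffffff>] 0xffffffffffffffff
"""
TRACE_B = """backtrace:
[<ffffffff80276b85>] lru_cache_add_active+0x4f/0x53
[<ffffffff802d33e1>] compat_do_execve+0x1e0/0x2fd
[<ffffffffffffffff>] 0xffffffffffffffff
"""
TRACE_C = """backtrace:
[<ffffffff80276b85>] lru_cache_add_active+0x4f/0x53
[<ffffffff8027d182>] do_wp_page+0x355/0x6f6
"""

for chain, count in group_backtraces([TRACE_A, TRACE_B, TRACE_C]).items():
    print(count, " <- ".join(chain))
```

Grouping on function names alone treats inlined callers as the same chain, which appears to be the intent when the offsets are the only difference.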
Chris
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: tracking memory usage/leak in "inactive" field in /proc/meminfo?
2010-02-12 7:35 ` Chris Friesen
@ 2010-02-12 8:04 ` KOSAKI Motohiro
-1 siblings, 0 replies; 41+ messages in thread
From: KOSAKI Motohiro @ 2010-02-12 8:04 UTC (permalink / raw)
To: Chris Friesen
Cc: kosaki.motohiro, Minchan Kim, Rik van Riel,
Linux Kernel Mailing List, linux-mm, Balbir Singh
> backtrace:
> [<ffffffff8061c162>] kmemleak_alloc_page+0x1eb/0x380
> [<ffffffff80276ae8>] __pagevec_lru_add_active+0xb6/0x104
> [<ffffffff80276b85>] lru_cache_add_active+0x4f/0x53
> [<ffffffff8027d182>] do_wp_page+0x355/0x6f6
> [<ffffffff8027eef1>] handle_mm_fault+0x62b/0x77c
> [<ffffffff80632557>] do_page_fault+0x3c7/0xba0
> [<ffffffff8062fb79>] error_exit+0x0/0x51
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> and
>
> backtrace:
> [<ffffffff8061c162>] kmemleak_alloc_page+0x1eb/0x380
> [<ffffffff80276ae8>] __pagevec_lru_add_active+0xb6/0x104
> [<ffffffff80276b85>] lru_cache_add_active+0x4f/0x53
> [<ffffffff8027eddc>] handle_mm_fault+0x516/0x77c
> [<ffffffff8027f180>] get_user_pages+0x13e/0x462
> [<ffffffff802a2f65>] get_arg_page+0x6a/0xca
> [<ffffffff802a30bf>] copy_strings+0xfa/0x1d4
> [<ffffffff802a31c7>] copy_strings_kernel+0x2e/0x43
> [<ffffffff802d33fb>] compat_do_execve+0x1fa/0x2fd
> [<ffffffff8021e405>] sys32_execve+0x44/0x62
> [<ffffffff8021def5>] ia32_ptregs_common+0x25/0x50
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> I'll dig into them further, but do either of these look like known issues?
No known issue.
AFAIK, 2.6.27 through 2.6.33 don't have such a problem.
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: tracking memory usage/leak in "inactive" field in /proc/meminfo?
2010-02-12 7:35 ` Chris Friesen
@ 2010-02-15 15:50 ` Chris Friesen
-1 siblings, 0 replies; 41+ messages in thread
From: Chris Friesen @ 2010-02-15 15:50 UTC (permalink / raw)
To: Minchan Kim
Cc: KOSAKI Motohiro, Rik van Riel, Linux Kernel Mailing List,
linux-mm, Balbir Singh
On 02/12/2010 01:35 AM, Chris Friesen wrote:
> After being up about 2.5 hrs, there were 4265 pages in the LRU that
> weren't part of file or anon. These broke down into two separate call
> chains (there were actually three separate offsets within
> compat_do_execve, but the rest was identical):
I added some further instrumentation to track timestamps of when they
were added to the LRU, and when they were added/removed from
NR_ANON_PAGES. Based on this, it appears that the pages are being
removed from NR_ANON_PAGES but are still left in the LRU.
It looks like I have three general paths leading to the removal of the
pages from NR_ANON_PAGES:
del from anon list backtrace:
[<ffffffff8029c951>] kmemleak_clear_anon+0x7f/0xbe
[<ffffffff802864c7>] page_remove_rmap+0x45/0x146
[<ffffffff8027dc7e>] unmap_vmas+0x41c/0x948
[<ffffffff80282405>] exit_mmap+0x7b/0x108
[<ffffffff8022f441>] mmput+0x33/0x110
[<ffffffff80233b05>] exit_mm+0x103/0x130
[<ffffffff802355b5>] do_exit+0x17b/0x91f
[<ffffffff80235d95>] do_group_exit+0x3c/0x9c
[<ffffffff80235e07>] sys_exit+0x0/0x12
[<ffffffff8021ddb5>] ia32_syscall_done+0x0/0xa
[<ffffffffffffffff>] 0xffffffffffffffff
del from anon list backtrace:
[<ffffffff8029c951>] kmemleak_clear_anon+0x7f/0xbe
[<ffffffff802864c7>] page_remove_rmap+0x45/0x146
[<ffffffff8027dc7e>] unmap_vmas+0x41c/0x948
[<ffffffff80282405>] exit_mmap+0x7b/0x108
[<ffffffff8022f441>] mmput+0x33/0x110
[<ffffffff802a3a4e>] flush_old_exec+0x1d6/0x86a
[<ffffffff802dc007>] load_elf_binary+0x366/0x1d1f
[<ffffffff802a35c6>] search_binary_handler+0xa4/0x25a
[<ffffffff802d36dc>] compat_do_execve+0x2ab/0x2fd
[<ffffffff8021e435>] sys32_execve+0x44/0x62
[<ffffffff8021df25>] ia32_ptregs_common+0x25/0x50
[<ffffffffffffffff>] 0xffffffffffffffff
del from anon list backtrace:
[<ffffffff8029c951>] kmemleak_clear_anon+0x7f/0xbe
[<ffffffff802864c7>] page_remove_rmap+0x45/0x146
[<ffffffff8027d1d7>] do_wp_page+0x37a/0x6f6
[<ffffffff8027ef21>] handle_mm_fault+0x62b/0x77c
[<ffffffff80632787>] do_page_fault+0x3c7/0xba0
[<ffffffff8062fda9>] error_exit+0x0/0x51
Looking at the code, it looks like page_remove_rmap() clears the
Anonpage flag and removes it from NR_ANON_PAGES, and the caller is
responsible for removing it from the LRU. Is that right?
I'll keep digging in the code, but does anyone know where the removal
from the LRU is supposed to happen in the above code paths?
Thanks,
Chris
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: tracking memory usage/leak in "inactive" field in /proc/meminfo?
2010-02-15 15:50 ` Chris Friesen
@ 2010-02-15 17:00 ` Rik van Riel
-1 siblings, 0 replies; 41+ messages in thread
From: Rik van Riel @ 2010-02-15 17:00 UTC (permalink / raw)
To: Chris Friesen
Cc: Minchan Kim, KOSAKI Motohiro, Linux Kernel Mailing List, linux-mm,
Balbir Singh
On 02/15/2010 10:50 AM, Chris Friesen wrote:
> Looking at the code, it looks like page_remove_rmap() clears the
> Anonpage flag and removes it from NR_ANON_PAGES, and the caller is
> responsible for removing it from the LRU. Is that right?
Nope.
> I'll keep digging in the code, but does anyone know where the removal
> from the LRU is supposed to happen in the above code paths?
Removal from the LRU is done from the page freeing code, on
the final free of the page.
It appears you have code somewhere that increments the reference
count on user pages and then forgets to lower it afterwards.
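The effect of a leaked reference can be illustrated with a toy model. The Python sketch below is not kernel code and simplifies heavily: the class and method names only loosely mirror get_page, put_page and page_remove_rmap, and the model assumes that the mapping holds one reference and that the page leaves the LRU only when the last reference is dropped. Under those assumptions, an unbalanced get leaves a page on the LRU after it has stopped counting as anonymous, which matches the symptom reported in this thread.

```python
class Page:
    """Toy stand-in for a page; only the two counters that matter here."""

    def __init__(self):
        self.refcount = 0
        self.mapcount = 0


class ToyMM:
    """Hypothetical model of anon accounting versus LRU membership."""

    def __init__(self):
        self.nr_anon = 0   # stands in for NR_ANON_PAGES
        self.lru = set()   # stands in for the active + inactive lists

    def fault_in_anon(self):
        """Allocate an anonymous page, map it once, and put it on the LRU."""
        page = Page()
        page.refcount = 1  # reference held on behalf of the mapping
        page.mapcount = 1
        self.nr_anon += 1
        self.lru.add(page)
        return page

    def get_page(self, page):
        """Take an extra reference, as a driver might."""
        page.refcount += 1

    def put_page(self, page):
        """Drop a reference; the final drop frees the page and removes it from the LRU."""
        page.refcount -= 1
        if page.refcount == 0:
            self.lru.discard(page)

    def unmap(self, page):
        """Unmap the page: drop the anon accounting, then the mapping's reference."""
        page.mapcount -= 1
        if page.mapcount == 0:
            self.nr_anon -= 1
        self.put_page(page)


mm = ToyMM()

ok = mm.fault_in_anon()
mm.unmap(ok)                      # balanced: the page leaves the LRU

leaked = mm.fault_in_anon()
mm.get_page(leaked)               # extra reference that is never dropped
mm.unmap(leaked)                  # anon count goes down, page stays on the LRU

print("anon pages:", mm.nr_anon, "pages still on LRU:", len(mm.lru))
```

In this model the only way for the stranded page to leave the LRU is a matching `put_page`, which is why the search moves from the unmap paths to whatever code took the extra reference.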
--
All rights reversed.
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: tracking memory usage/leak in "inactive" field in /proc/meminfo?
2010-02-15 17:00 ` Rik van Riel
@ 2010-02-16 16:52 ` Chris Friesen
-1 siblings, 0 replies; 41+ messages in thread
From: Chris Friesen @ 2010-02-16 16:52 UTC (permalink / raw)
To: Rik van Riel
Cc: Minchan Kim, KOSAKI Motohiro, Linux Kernel Mailing List, linux-mm,
Balbir Singh
On 02/15/2010 11:00 AM, Rik van Riel wrote:
> On 02/15/2010 10:50 AM, Chris Friesen wrote:
>
>> Looking at the code, it looks like page_remove_rmap() clears the
>> Anonpage flag and removes it from NR_ANON_PAGES, and the caller is
>> responsible for removing it from the LRU. Is that right?
>
> Nope.
>
>> I'll keep digging in the code, but does anyone know where the removal
>> from the LRU is supposed to happen in the above code paths?
>
> Removal from the LRU is done from the page freeing code, on
> the final free of the page.
>
> It appears you have code somewhere that increments the reference
> count on user pages and then forgets to lower it afterwards.
Okay, that makes sense.
I'm still trying to get a handle on the LRU removal though. The code
path that I saw most which resulted in clearing the anon bit but leaving
the page on the LRU was the following:
[<ffffffff8029c951>] kmemleak_clear_anon+0x7f/0xbe
[<ffffffff802864c7>] page_remove_rmap+0x45/0x146
[<ffffffff8027dc7e>] unmap_vmas+0x41c/0x948
[<ffffffff80282405>] exit_mmap+0x7b/0x108
[<ffffffff8022f441>] mmput+0x33/0x110
[<ffffffff80233b05>] exit_mm+0x103/0x130
[<ffffffff802355b5>] do_exit+0x17b/0x91f
[<ffffffff80235d95>] do_group_exit+0x3c/0x9c
[<ffffffff80235e07>] sys_exit+0x0/0x12
[<ffffffff8021ddb5>] ia32_syscall_done+0x0/0xa
There are a bunch of inline functions involved, but I think the chain
from page_remove_rmap() back up to unmap_vmas() looks like this:
page_remove_rmap
zap_pte_range
zap_pmd_range
zap_pud_range
unmap_page_range
unmap_vmas
So in this scenario, where do the pages actually get removed from the
LRU list (assuming that they're not in use by anyone else)?
Thanks,
Chris
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: tracking memory usage/leak in "inactive" field in /proc/meminfo?
2010-02-16 16:52 ` Chris Friesen
@ 2010-02-16 17:12 ` Rik van Riel
-1 siblings, 0 replies; 41+ messages in thread
From: Rik van Riel @ 2010-02-16 17:12 UTC (permalink / raw)
To: Chris Friesen
Cc: Minchan Kim, KOSAKI Motohiro, Linux Kernel Mailing List, linux-mm,
Balbir Singh
On 02/16/2010 11:52 AM, Chris Friesen wrote:
> On 02/15/2010 11:00 AM, Rik van Riel wrote:
>> Removal from the LRU is done from the page freeing code, on
>> the final free of the page.
> There are a bunch of inline functions involved, but I think the chain
> from page_remove_rmap() back up to unmap_vmas() looks like this:
>
> page_remove_rmap
> zap_pte_range
> zap_pmd_range
> zap_pud_range
> unmap_page_range
> unmap_vmas
>
> So in this scenario, where do the pages actually get removed from the
> LRU list (assuming that they're not in use by anyone else)?
__page_cache_release
--
All rights reversed.
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: tracking memory usage/leak in "inactive" field in /proc/meminfo?
2010-02-16 17:12 ` Rik van Riel
@ 2010-02-16 21:26 ` Chris Friesen
-1 siblings, 0 replies; 41+ messages in thread
From: Chris Friesen @ 2010-02-16 21:26 UTC (permalink / raw)
To: Rik van Riel
Cc: Minchan Kim, KOSAKI Motohiro, Linux Kernel Mailing List, linux-mm,
Balbir Singh
On 02/16/2010 11:12 AM, Rik van Riel wrote:
> On 02/16/2010 11:52 AM, Chris Friesen wrote:
>> On 02/15/2010 11:00 AM, Rik van Riel wrote:
>
>>> Removal from the LRU is done from the page freeing code, on
>>> the final free of the page.
>
>> There are a bunch of inline functions involved, but I think the chain
>> from page_remove_rmap() back up to unmap_vmas() looks like this:
>>
>> page_remove_rmap
>> zap_pte_range
>> zap_pmd_range
>> zap_pud_range
>> unmap_page_range
>> unmap_vmas
>>
>> So in this scenario, where do the pages actually get removed from the
>> LRU list (assuming that they're not in use by anyone else)?
>
> __page_cache_release
For the backtrace scenario I posted, it seems like it might actually be
release_pages(). There seems to be a plausible call chain:
__ClearPageLRU
release_pages
free_pages_and_swap_cache
tlb_flush_mmu
tlb_remove_page
zap_pte_range
Does that seem right? In this case, tlb_remove_page() is called right
after page_remove_rmap(), and it is page_remove_rmap() that ultimately
results in clearing the PageAnon bit.
Chris
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: tracking memory usage/leak in "inactive" field in /proc/meminfo?
2010-02-16 21:26 ` Chris Friesen
@ 2010-02-16 22:22 ` Rik van Riel
-1 siblings, 0 replies; 41+ messages in thread
From: Rik van Riel @ 2010-02-16 22:22 UTC (permalink / raw)
To: Chris Friesen
Cc: Minchan Kim, KOSAKI Motohiro, Linux Kernel Mailing List, linux-mm,
Balbir Singh
On 02/16/2010 04:26 PM, Chris Friesen wrote:
> For the backtrace scenario I posted it seems like it might actually be
> release_pages(). There seems to be a plausible call chain:
>
> __ClearPageLRU
> release_pages
> free_pages_and_swap_cache
> tlb_flush_mmu
> tlb_remove_page
> zap_pte_range
>
> Does that seem right? In this case, tlb_remove_page() is called right
> after page_remove_rmap() which ultimately results in clearing the
> PageAnon bit.
That is right - and it pins the blame for the memory leak
on some third-party code that fails to release a refcount on
memory pages.
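
A rough userspace sketch of that behaviour may help. It is a toy model,
not kernel code: struct toy_page and the toy_* helpers are hypothetical
names invented for illustration, and the fields only stand in for the
page count, PageLRU and PageAnon. It assumes, as described above, that a
page leaves the LRU only when its final reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a page; hypothetical, NOT the kernel's struct page. */
struct toy_page {
    int  refcount; /* stands in for the page reference count */
    bool on_lru;   /* stands in for the PageLRU flag */
    bool anon;     /* stands in for the PageAnon state */
    bool freed;    /* set once the final reference has been dropped */
};

/* A mapped anonymous page: one reference, held by the mapping. */
static void toy_page_init(struct toy_page *p)
{
    p->refcount = 1;
    p->on_lru = true;
    p->anon = true;
    p->freed = false;
}

/* Take an extra reference, as a driver pinning a user page might. */
static void toy_get_page(struct toy_page *p)
{
    assert(!p->freed);
    p->refcount++;
}

/* Drop one reference; only the final drop removes the page from the LRU. */
static void toy_put_page(struct toy_page *p)
{
    assert(p->refcount > 0);
    if (--p->refcount == 0) {
        p->on_lru = false;
        p->freed = true;
    }
}

/* Unmap at exit: clear the anon state, then drop the mapping's reference. */
static void toy_unmap(struct toy_page *p)
{
    p->anon = false;
    toy_put_page(p);
}

/* The state seen in this thread: no longer anon, not freed, still on the LRU. */
static bool toy_page_is_stranded(const struct toy_page *p)
{
    return p->on_lru && !p->anon && !p->freed;
}
```

In this model, a page that is initialised and then unmapped ends up
freed and off the LRU. Calling toy_get_page() once before toy_unmap(),
with no matching toy_put_page(), leaves toy_page_is_stranded() true.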
--
All rights reversed.
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: tracking memory usage/leak in "inactive" field in /proc/meminfo? -- solved
2010-02-16 22:22 ` Rik van Riel
@ 2010-02-18 15:39 ` Chris Friesen
-1 siblings, 0 replies; 41+ messages in thread
From: Chris Friesen @ 2010-02-18 15:39 UTC (permalink / raw)
To: Rik van Riel
Cc: Minchan Kim, KOSAKI Motohiro, Linux Kernel Mailing List, linux-mm,
Balbir Singh
On 02/16/2010 04:22 PM, Rik van Riel wrote:
> On 02/16/2010 04:26 PM, Chris Friesen wrote:
>
>> For the backtrace scenario I posted it seems like it might actually be
>> release_pages(). There seems to be a plausible call chain:
>>
>> __ClearPageLRU
>> release_pages
>> free_pages_and_swap_cache
>> tlb_flush_mmu
>> tlb_remove_page
>> zap_pte_range
>>
>> Does that seem right? In this case, tlb_remove_page() is called right
>> after page_remove_rmap() which ultimately results in clearing the
>> PageAnon bit.
>
> That is right - and pinpoints the fault for the memory leak
> on some third party code that fails to release a refcount on
> memory pages.
I think I've tracked down the source of the problem. It turns out one of
our vendors had misapplied a patch, which ended up bumping the page count
an extra time.
Thanks to everyone who helped out.
Chris
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: tracking memory usage/leak in "inactive" field in /proc/meminfo?
2010-02-12 2:38 ` Minchan Kim
@ 2010-02-12 17:50 ` Catalin Marinas
-1 siblings, 0 replies; 41+ messages in thread
From: Catalin Marinas @ 2010-02-12 17:50 UTC (permalink / raw)
To: Minchan Kim
Cc: Chris Friesen, KOSAKI Motohiro, Rik van Riel,
Linux Kernel Mailing List, linux-mm, Balbir Singh
Minchan Kim <minchan.kim@gmail.com> wrote:
> On Fri, Feb 12, 2010 at 3:54 AM, Chris Friesen <cfriesen@nortel.com> wrote:
>> I have a modified version of that which I picked up as part of the
>> kmemleak backport. However, it doesn't help unless I can narrow down
>> *which* pages I should care about.
>
> kmemleak doesn't support the page allocator or ioremap.
> The patch at the above URL can only tell who requested a page that is
> in use (i.e., not free) now.
ioremap can be easily tracked by kmemleak (it is on my to-do list,
but I haven't managed to do it yet). It's not far from vmalloc.
The page allocator is a bit more difficult, since it's used by the slab
allocator as well and tracking it may lead to some recursive calls into
kmemleak. I'll have a think.
Anyway, you can leak memory without this being detected by kmemleak -
just add the allocated objects to a list and never remove them.
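
To illustrate that last point, here is a minimal userspace sketch. It
assumes a detector that reports only allocations it can no longer reach
from known roots (kmemleak works along these lines by scanning memory
for pointers). All names here (toy_obj, toy_alloc, toy_list_add,
toy_live_count, toy_scan_orphans) are hypothetical and the scan is
deliberately simplistic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy allocation record; hypothetical, not a kmemleak structure. */
struct toy_obj {
    struct toy_obj *next;     /* link in the root list, if the object is on it */
    struct toy_obj *all_next; /* link in the tracker's list of every allocation */
    bool marked;              /* set by the scan when reachable from the root */
};

static struct toy_obj *toy_all;  /* every allocation made (nothing is ever freed) */
static struct toy_obj *toy_root; /* global list head: the only scan root */

/* Allocate an object and record it with the tracker. */
static struct toy_obj *toy_alloc(void)
{
    struct toy_obj *o = calloc(1, sizeof(*o));

    if (o) {
        o->all_next = toy_all;
        toy_all = o;
    }
    return o;
}

/* Put an object on the global list; nothing ever takes it off again. */
static void toy_list_add(struct toy_obj *o)
{
    assert(o != NULL);
    o->next = toy_root;
    toy_root = o;
}

/* Number of allocations still held, whether reachable or not. */
static size_t toy_live_count(void)
{
    size_t n = 0;

    for (struct toy_obj *o = toy_all; o; o = o->all_next)
        n++;
    return n;
}

/* Count allocations not reachable from toy_root: what a scanner would report. */
static size_t toy_scan_orphans(void)
{
    size_t orphans = 0;

    for (struct toy_obj *o = toy_all; o; o = o->all_next)
        o->marked = false;
    for (struct toy_obj *o = toy_root; o; o = o->next)
        o->marked = true;
    for (struct toy_obj *o = toy_all; o; o = o->all_next)
        if (!o->marked)
            orphans++;
    return orphans;
}
```

If every object from toy_alloc() is passed to toy_list_add(),
toy_live_count() keeps growing while toy_scan_orphans() stays at zero,
so the growth is never flagged. Only an object whose last pointer is
dropped without being listed shows up as an orphan.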
--
Catalin
^ permalink raw reply [flat|nested] 41+ messages in thread