From: Vlastimil Babka <vbabka@suse.cz>
To: Rik van Riel <riel@redhat.com>, Michal Hocko <mhocko@suse.cz>,
linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
Hugh Dickins <hughd@google.com>,
Michel Lespinasse <walken@google.com>,
Andrea Arcangeli <andrea@kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Daniel Forrest <dan.forrest@ssec.wisc.edu>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: anon_vma accumulating for certain load still not addressed
Date: Fri, 14 Nov 2014 18:10:47 +0100
Message-ID: <54663797.1060106@suse.cz>
In-Reply-To: <54661A8C.5050806@redhat.com>

On 11/14/2014 04:06 PM, Rik van Riel wrote:
> On 11/14/2014 08:08 AM, Michal Hocko wrote:
>> Hi,
>> back in 2012 [1] there was a discussion about a forking load which
>> accumulates anon_vmas. There was a trivial test case which triggers this
>> and can potentially let a local user deplete memory.
>>
>> We have a report for an older enterprise distribution where nsd is
>> most probably suffering from this issue (I haven't debugged it thoroughly,
>> but accumulating anon_vma structs over time sounds like a good enough
>> fit) and has to be restarted after some time to release the accumulated
>> anon_vma objects.
>>
>> There was a patch which tried to work around the issue [2] but I do not
>> see any follow-ups nor any indication that the issue would be addressed
>> in another way.
>>
>> The test program from [1] was running for around 39 mins on my laptop
>> and here is the result:
>>
>> $ date +%s; grep anon_vma /proc/slabinfo
>> 1415960225
>> anon_vma 11664 11900 160 25 1 : tunables 0 0 0 : slabdata 476 476 0
>>
>> $ ./a # The reproducer
>>
>> $ date +%s; grep anon_vma /proc/slabinfo
>> 1415962592
>> anon_vma 34875 34875 160 25 1 : tunables 0 0 0 : slabdata 1395 1395 0
>>
>> $ killall a
>> $ date +%s; grep anon_vma /proc/slabinfo
>> 1415962607
>> anon_vma 11277 12175 160 25 1 : tunables 0 0 0 : slabdata 487 487 0
>>
>> So we have accumulated 23211 objects over that time period before the
>> offender was killed, which released all of them.
>>
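
For what it's worth, the two samples above can be turned into a byte
estimate. A minimal sketch in C, assuming the usual /proc/slabinfo column
order (name, active_objs, num_objs, objsize, ...); the struct and helper
names are made up for illustration and are not taken from any existing
tool:

```c
#include <stdio.h>

/* One parsed /proc/slabinfo row; only the first three numeric columns
 * are kept (hypothetical helper type, for illustration only). */
struct slab_row {
    unsigned long active_objs;
    unsigned long num_objs;
    unsigned long objsize;      /* bytes per object */
};

/* Parse a line such as "anon_vma 34875 34875 160 25 1 : tunables ...".
 * Returns 0 on success, -1 if the line does not match. */
static int parse_slab_row(const char *line, struct slab_row *row)
{
    char name[64];

    if (sscanf(line, "%63s %lu %lu %lu", name,
               &row->active_objs, &row->num_objs, &row->objsize) != 4)
        return -1;
    return 0;
}

/* Bytes held by the objects that appeared between two samples
 * (0 if the active object count did not grow). */
static unsigned long accumulated_bytes(const struct slab_row *before,
                                       const struct slab_row *after)
{
    if (after->active_objs <= before->active_objs)
        return 0;
    return (after->active_objs - before->active_objs) * after->objsize;
}
```

Fed with the first two anon_vma lines quoted above, this gives
23211 objects * 160 bytes = 3713760 bytes, i.e. about 3.5 MiB for the
anon_vma objects alone after roughly 39 minutes.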
>> The proposed workaround is kind of ugly, but do people have a better idea
>> than reference counting? If not, should we merge it?
>
> I believe we should just merge that patch.
>
> I have not seen any better ideas come by.

I have a very vague idea: if we could distinguish (with a flag?) an
anon_vma_chain (avc) pointing to the parent's anon_vma from the avc's
created for new anon_vma's in the child, we could maybe detect, at
"child-type" avc removal time, that the only avc's left for a non-root
anon_vma are those of "parent-type" pointing from children. Then we could
go through all pages that map the anon_vma and change their mapping to the
root anon_vma. The root would have to stay, orphaned or not, because of
the lock there.

That would remove the need for determining a magic constant, and the
possibility that we still leave non-useful "orphaned" anon_vma's on the
top levels of the fork hierarchy while all the bottom levels have to
share the last anon_vma's that were allowed to be created. I'm not sure
if that's the case with nsd, i.e. whether besides the "orphaned parent"
forks it also forks some workers that would no longer benefit from having
their private anon_vma's.
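
To make the detection step a bit more concrete, here is a user-space toy
model in C. Everything in it is hypothetical: the kind flag does not exist
in the kernel's struct anon_vma_chain, the model_* names are invented, and
a plain linked list stands in for the kernel's per-anon_vma interval tree.
Locking and the actual page remapping are left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag: not part of the kernel's struct anon_vma_chain. */
enum avc_kind {
    AVC_CHILD,   /* avc created for the anon_vma that is new in this process */
    AVC_PARENT,  /* avc of a forked child pointing back to an ancestor's anon_vma */
};

struct model_anon_vma;

struct model_avc {
    enum avc_kind kind;
    struct model_anon_vma *anon_vma;
    struct model_avc *next;      /* next avc attached to the same anon_vma */
};

struct model_anon_vma {
    struct model_anon_vma *root; /* points to itself for the root */
    struct model_avc *chains;    /* list of avcs attached to this anon_vma */
};

/* Unlink an avc from the list of its anon_vma. */
static void model_avc_unlink(struct model_avc *avc)
{
    struct model_avc **pp = &avc->anon_vma->chains;

    while (*pp && *pp != avc)
        pp = &(*pp)->next;
    if (*pp)
        *pp = avc->next;
    avc->next = NULL;
}

/* True when a non-root anon_vma is referenced only by "parent-type" avcs,
 * which is the state where its pages could be moved to the root. */
static bool model_only_parent_refs(const struct model_anon_vma *av)
{
    const struct model_avc *avc;

    if (av == av->root || av->chains == NULL)
        return false;
    for (avc = av->chains; avc; avc = avc->next)
        if (avc->kind == AVC_CHILD)
            return false;
    return true;
}
```

The check would run after model_avc_unlink() of a "child-type" avc; the
root is excluded up front since it has to stay because of the lock.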

Of course the downside is that the idea would be too complicated wrt
locking and incur overhead on some fast paths (process exit?). And I
admit I'm not very familiar with the code (which is perhaps a euphemism :)

Still, what do you think, Rik?

Vlastimil

> The comment should probably be fixed to reflect the
> chain length of 5 though :)
>