From: ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org (Eric W. Biederman)
To: Sukadev Bhattiprolu
<sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Cc: Linux Kernel Mailing List
<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
andrea-Vyt77T80VFVWk0Htik3J/w@public.gmane.org,
Linux Containers
<containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org>,
Daniel Lezcano <dlezcano-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>,
Pavel Emelianov <xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
Subject: Re: pidns memory leak
Date: Fri, 09 Oct 2009 19:08:44 -0700 [thread overview]
Message-ID: <m1vdiouhjn.fsf@fess.ebiederm.org> (raw)
In-Reply-To: <20091010015859.GB11904-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> (Sukadev Bhattiprolu's message of "Fri\, 9 Oct 2009 18\:58\:59 -0700")
Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> writes:
> Eric W. Biederman [ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org] wrote:
> | Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> writes:
> |
> | > Andrea,
> | >
> | > We have been running a leak in child pid namespaces and some early debugging
> | > points to the following commit:
> | >
> | >>> commit 7766755a2f249e7e0dabc5255a0a3d151ff79821
> | >>> Author: Andrea Arcangeli <andrea-l3A5Bk7waGM@public.gmane.org>
> | >>> Date: Mon Feb 4 22:29:21 2008 -0800
> | >>>
> | >
> | > Reverting the commit seems to fix the leak but we need to do some more
> | > analysis (like the lstat() question Daniel has).
> |
> | Yes.
> |
> | That entire path is an optimization. It should not be needed for correct
> | operation. Although it may be responsible for some false positives.
> |
> | > However I have a basic question regarding the commit - the log mentions:
> | >
> | > > do_exit->release_task->mark_inode_dirty_sync->schedule() (will never
> | > > come back to run journal_stop)
> | >
> | > But release_task() calls shrink_dcache_parent() for a _procfs_ dentry. Does
> | > journal_stop() apply to procfs also ?
> |
> | The problem when the that PF_EXITING check was introduced is that
> | shrink_dcache_parent could shrink dcache entries for other
> | filesystems. Last I looked that is no longer the case and we can
> | remove that code.
>
> Ok.
>
> | As I recall proc_flush_task_mnt has a few other minor bugs as well that
> | could cause problems.
>
> Can you give me some more details on those bugs ? Reverting the commit
> seems to fix the problem.
>
> |
> | Ultimately what problems are you seeing?
>
> We are leaking 'struct pid', proc_inode, and 'struct pid_namespace', when
> container-init exits before its descendant processes. i.e when the
> container-init zaps its descendants and waits for them, it calls the
> proc_flush_task_mnt(), but then misses the shrink_dcache_parent() call due
> to the above commit.
>
> So the proc_inode is never deleted and the references to struct pid and
> pid_namespace never go away. Details of the leak are buried in the
> previous mail...
In should be the case that bloating up the dcache so that we get a general
shrink_dcache from the memory reclaim code will free the proc_inode and
the appropriate data structures. struct pid is supposed to be small and
safe to leak in rare circumstances.
It should be possible to trigger this condition by creating a pid namespace.
cd /proc/<pid>/ (where <pid> is some process in that pid namespace)
Terminating that pid namespace.
But you are still actively using the proc_inode and the struct pid for the
process that has been killed. Because a process has it as it's current
working directory.
Eric
WARNING: multiple messages have this Message-ID (diff)
From: ebiederm@xmission.com (Eric W. Biederman)
To: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Daniel Lezcano <dlezcano@fr.ibm.com>,
andrea@cpushare.com, Pavel Emelianov <xemul@openvz.org>,
Sukadev Bhattiprolu <sukadev@us.ibm.com>,
Linux Containers <containers@lists.osdl.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: pidns memory leak
Date: Fri, 09 Oct 2009 19:08:44 -0700 [thread overview]
Message-ID: <m1vdiouhjn.fsf@fess.ebiederm.org> (raw)
In-Reply-To: <20091010015859.GB11904@us.ibm.com> (Sukadev Bhattiprolu's message of "Fri\, 9 Oct 2009 18\:58\:59 -0700")
Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> writes:
> Eric W. Biederman [ebiederm@xmission.com] wrote:
> | Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> writes:
> |
> | > Andrea,
> | >
> | > We have been running a leak in child pid namespaces and some early debugging
> | > points to the following commit:
> | >
> | >>> commit 7766755a2f249e7e0dabc5255a0a3d151ff79821
> | >>> Author: Andrea Arcangeli <andrea@suse.de>
> | >>> Date: Mon Feb 4 22:29:21 2008 -0800
> | >>>
> | >
> | > Reverting the commit seems to fix the leak but we need to do some more
> | > analysis (like the lstat() question Daniel has).
> |
> | Yes.
> |
> | That entire path is an optimization. It should not be needed for correct
> | operation. Although it may be responsible for some false positives.
> |
> | > However I have a basic question regarding the commit - the log mentions:
> | >
> | > > do_exit->release_task->mark_inode_dirty_sync->schedule() (will never
> | > > come back to run journal_stop)
> | >
> | > But release_task() calls shrink_dcache_parent() for a _procfs_ dentry. Does
> | > journal_stop() apply to procfs also ?
> |
> | The problem when the that PF_EXITING check was introduced is that
> | shrink_dcache_parent could shrink dcache entries for other
> | filesystems. Last I looked that is no longer the case and we can
> | remove that code.
>
> Ok.
>
> | As I recall proc_flush_task_mnt has a few other minor bugs as well that
> | could cause problems.
>
> Can you give me some more details on those bugs ? Reverting the commit
> seems to fix the problem.
>
> |
> | Ultimately what problems are you seeing?
>
> We are leaking 'struct pid', proc_inode, and 'struct pid_namespace', when
> container-init exits before its descendant processes. i.e when the
> container-init zaps its descendants and waits for them, it calls the
> proc_flush_task_mnt(), but then misses the shrink_dcache_parent() call due
> to the above commit.
>
> So the proc_inode is never deleted and the references to struct pid and
> pid_namespace never go away. Details of the leak are buried in the
> previous mail...
In should be the case that bloating up the dcache so that we get a general
shrink_dcache from the memory reclaim code will free the proc_inode and
the appropriate data structures. struct pid is supposed to be small and
safe to leak in rare circumstances.
It should be possible to trigger this condition by creating a pid namespace.
cd /proc/<pid>/ (where <pid> is some process in that pid namespace)
Terminating that pid namespace.
But you are still actively using the proc_inode and the struct pid for the
process that has been killed. Because a process has it as it's current
working directory.
Eric
next prev parent reply other threads:[~2009-10-10 2:08 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-10-02 12:27 pidns memory leak Daniel Lezcano
2009-10-02 12:27 ` Daniel Lezcano
[not found] ` <4AC5F198.2070407-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
2009-10-06 4:05 ` Sukadev Bhattiprolu
2009-10-06 4:05 ` Sukadev Bhattiprolu
[not found] ` <20091006040526.GA22923-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-10-06 8:18 ` Daniel Lezcano
2009-10-06 8:18 ` Daniel Lezcano
[not found] ` <4ACAFD6A.3060008-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
2009-10-08 3:08 ` Sukadev Bhattiprolu
2009-10-08 3:08 ` Sukadev Bhattiprolu
[not found] ` <20091008030828.GA18973-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-10-08 8:11 ` Daniel Lezcano
2009-10-08 8:11 ` Daniel Lezcano
[not found] ` <4ACD9ECC.90508-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
2009-10-09 3:29 ` Sukadev Bhattiprolu
2009-10-09 3:29 ` Sukadev Bhattiprolu
[not found] ` <20091009032928.GA2031-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-10-09 13:18 ` Daniel Lezcano
2009-10-09 13:18 ` Daniel Lezcano
[not found] ` <4ACF381F.9050808-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
2009-10-09 20:38 ` Sukadev Bhattiprolu
2009-10-09 20:38 ` Sukadev Bhattiprolu
[not found] ` <20091009203809.GA12230-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-10-09 20:50 ` Eric W. Biederman
2009-10-09 20:50 ` Eric W. Biederman
[not found] ` <m18wfkl2bf.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2009-10-10 1:58 ` Sukadev Bhattiprolu
2009-10-10 1:58 ` Sukadev Bhattiprolu
[not found] ` <20091010015859.GB11904-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-10-10 2:08 ` Eric W. Biederman [this message]
2009-10-10 2:08 ` Eric W. Biederman
2009-10-09 21:54 ` Matt Helsley
2009-10-09 21:54 ` Matt Helsley
2009-10-10 1:32 ` Sukadev Bhattiprolu
2009-10-10 1:32 ` Sukadev Bhattiprolu
[not found] ` <20091010013235.GA11904-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-10-12 8:41 ` Daniel Lezcano
2009-10-12 8:41 ` Daniel Lezcano
[not found] ` <4AD2EBC7.2020109-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
2009-10-14 6:15 ` Sukadev Bhattiprolu
2009-10-14 6:15 ` Sukadev Bhattiprolu
[not found] ` <20091014061533.GA23569-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-11-02 21:33 ` Andrew Morton
2009-11-02 21:33 ` Andrew Morton
2009-11-02 22:38 ` Serge E. Hallyn
2009-11-02 22:47 ` Andrew Morton
2009-11-03 7:24 ` Cedric Le Goater
2009-11-03 8:41 ` Eric W. Biederman
2009-11-03 9:24 ` Cedric Le Goater
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=m1vdiouhjn.fsf@fess.ebiederm.org \
--to=ebiederm-as9lmozglivwk0htik3j/w@public.gmane.org \
--cc=andrea-Vyt77T80VFVWk0Htik3J/w@public.gmane.org \
--cc=containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org \
--cc=dlezcano-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org \
--cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
--cc=sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org \
--cc=xemul-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.